DataFrameMacros.jl

DataFrameMacros.jl offers macros for manipulating DataFrames with a syntax geared towards clarity, brevity and convenience. Each macro translates expressions into the source .=> function .=> sink mini-language from DataFrames.jl.
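As a rough sketch of what this translation means, the macro call below should give the same result as the hand-written DataFrames.jl call next to it. The hand-written form is only an illustration; the expression the macro actually generates may differ in detail.

```julia
using DataFrames
using DataFrameMacros

df = DataFrame(a = 1:5, b = 6:10)

# Macro form: the expression is applied row-wise by default.
with_macro = @transform(df, :c = :a + :b)

# Hand-written form in the DataFrames.jl mini-language:
# source columns => function => sink column.
by_hand = transform(df, [:a, :b] => ByRow((a, b) -> a + b) => :c)

with_macro == by_hand
```

The column symbols in the macro expression become the source, the expression itself becomes the function, and the left-hand side of `=` becomes the sink.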

The following macros are currently available:

  • @transform / @transform!
  • @select / @select!
  • @groupby
  • @combine
  • @subset / @subset!
  • @sort / @sort!
  • @unique

Differences from DataFramesMeta.jl

  • Except for @combine, all macros work row-wise by default in DataFrameMacros.jl.
  • DataFrameMacros.jl uses {} to signal column expressions instead of $().
  • In DataFrameMacros.jl, you can apply the same expression to several columns in {} braces at once and even broadcast across multiple sets of columns.
  • In DataFrameMacros.jl, you can use special {{ }} multi-column expressions, which pass the values of all selected columns as one tuple. This makes it easier to aggregate across columns.
  • DataFrameMacros.jl has a special syntax to make use of transform! on a view returned from subset, so you can easily transform only some rows of your dataset with @transform!(df, @subset(...), ...).

If any of these points have changed, please open an issue.
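The row-wise default from the first point can be sketched with @subset, which has no example section of its own on this page. The data frame is made up for illustration.

```julia
using DataFrames
using DataFrameMacros
using Statistics

df = DataFrame(a = 1:5, b = 6:10)

# Row-wise by default: :a stands for a single value, so a scalar comparison works.
rowwise = @subset(df, :a > 2)

# With @bycol, :a stands for the whole column, so the comparison is broadcast
# explicitly and mean(:a) is the mean over all rows.
bycol = @subset(df, @bycol :a .> mean(:a))
```

Here `rowwise` keeps the rows where `a` is 3, 4 or 5, and `bycol` keeps the rows where `a` exceeds the column mean of 3.0.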

Examples

@select

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(a = 1:5, b = 6:10, c = 11:15)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      6     11
   2 │     2      7     12
   3 │     3      8     13
   4 │     4      9     14
   5 │     5     10     15

julia> @select(df, :a)
5×1 DataFrame
 Row │ a
     │ Int64
─────┼───────
   1 │     1
   2 │     2
   3 │     3
   4 │     4
   5 │     5

julia> @select(df, :a, :b)
5×2 DataFrame
 Row │ a      b
     │ Int64  Int64
─────┼──────────────
   1 │     1      6
   2 │     2      7
   3 │     3      8
   4 │     4      9
   5 │     5     10

julia> @select(df, :A = :a, :B = :b)
5×2 DataFrame
 Row │ A      B
     │ Int64  Int64
─────┼──────────────
   1 │     1      6
   2 │     2      7
   3 │     3      8
   4 │     4      9
   5 │     5     10

julia> @select(df, :a + 1)
5×1 DataFrame
 Row │ a_function
     │ Int64
─────┼────────────
   1 │          2
   2 │          3
   3 │          4
   4 │          5
   5 │          6

julia> @select(df, :a_plus_one = :a + 1)
5×1 DataFrame
 Row │ a_plus_one
     │ Int64
─────┼────────────
   1 │          2
   2 │          3
   3 │          4
   4 │          5
   5 │          6

julia> @select(df, {[:a, :b]} / 2)
5×2 DataFrame
 Row │ a_function  b_function
     │ Float64     Float64
─────┼────────────────────────
   1 │        0.5         3.0
   2 │        1.0         3.5
   3 │        1.5         4.0
   4 │        2.0         4.5
   5 │        2.5         5.0

julia> @select(df, sqrt({Not(:b)}))
5×2 DataFrame
 Row │ a_sqrt   c_sqrt
     │ Float64  Float64
─────┼──────────────────
   1 │ 1.0      3.31662
   2 │ 1.41421  3.4641
   3 │ 1.73205  3.60555
   4 │ 2.0      3.74166
   5 │ 2.23607  3.87298

julia> @select(df, 5 * {All()})
5×3 DataFrame
 Row │ a_function  b_function  c_function
     │ Int64       Int64       Int64
─────┼────────────────────────────────────
   1 │          5          30          55
   2 │         10          35          60
   3 │         15          40          65
   4 │         20          45          70
   5 │         25          50          75

julia> @select(df, {Between(1, 2)} - {Between(2, 3)})
5×2 DataFrame
 Row │ a_b_-  b_c_-
     │ Int64  Int64
─────┼──────────────
   1 │    -5     -5
   2 │    -5     -5
   3 │    -5     -5
   4 │    -5     -5
   5 │    -5     -5

julia> @select(df, "{1}_plus_{2}" = {Between(1, 2)} + {Between(2, 3)})
5×2 DataFrame
 Row │ a_plus_b  b_plus_c
     │ Int64     Int64
─────┼────────────────────
   1 │        7        17
   2 │        9        19
   3 │       11        21
   4 │       13        23
   5 │       15        25

julia> @select(df, @bycol :a .- :b)
ERROR: UndefVarError: .- not defined

julia> @select(df, :d = @bycol :a .+ 1)
5×1 DataFrame
 Row │ d
     │ Int64
─────┼───────
   1 │     2
   2 │     3
   3 │     4
   4 │     5
   5 │     6

julia> @select(df, "a_minus_{2}" = :a - {[:b, :c]})
5×2 DataFrame
 Row │ a_minus_b  a_minus_c
     │ Int64      Int64
─────┼──────────────────────
   1 │        -5        -10
   2 │        -5        -10
   3 │        -5        -10
   4 │        -5        -10
   5 │        -5        -10

julia> @select(df, "{1}_minus_{2}" = {[:a, :b, :c]} - {[:a, :b, :c]'})
5×9 DataFrame
 Row │ a_minus_a  b_minus_a  c_minus_a  a_minus_b  b_minus_b  c_minus_b  a_min ⋯
     │ Int64      Int64      Int64      Int64      Int64      Int64      Int64 ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │         0          5         10         -5          0          5        ⋯
   2 │         0          5         10         -5          0          5
   3 │         0          5         10         -5          0          5
   4 │         0          5         10         -5          0          5
   5 │         0          5         10         -5          0          5        ⋯
                                                               3 columns omitted

julia> @select(df, :a + mean({{[:b, :c]}}))
5×1 DataFrame
 Row │ a_b_c_function
     │ Float64
─────┼────────────────
   1 │            9.5
   2 │           11.5
   3 │           13.5
   4 │           15.5
   5 │           17.5

@transform

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(a = 1:5, b = 6:10, c = 11:15)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      6     11
   2 │     2      7     12
   3 │     3      8     13
   4 │     4      9     14
   5 │     5     10     15

julia> @transform(df, :a + 1)
5×4 DataFrame
 Row │ a      b      c      a_function
     │ Int64  Int64  Int64  Int64
─────┼─────────────────────────────────
   1 │     1      6     11           2
   2 │     2      7     12           3
   3 │     3      8     13           4
   4 │     4      9     14           5
   5 │     5     10     15           6

julia> @transform(df, :a_plus_one = :a + 1)
5×4 DataFrame
 Row │ a      b      c      a_plus_one
     │ Int64  Int64  Int64  Int64
─────┼─────────────────────────────────
   1 │     1      6     11           2
   2 │     2      7     12           3
   3 │     3      8     13           4
   4 │     4      9     14           5
   5 │     5     10     15           6

julia> @transform(df, @bycol :a .- mean(:b))
5×4 DataFrame
 Row │ a      b      c      a_b_function
     │ Int64  Int64  Int64  Float64
─────┼───────────────────────────────────
   1 │     1      6     11          -7.0
   2 │     2      7     12          -6.0
   3 │     3      8     13          -5.0
   4 │     4      9     14          -4.0
   5 │     5     10     15          -3.0

julia> @transform(df, :d = @bycol :a .+ 1)
5×4 DataFrame
 Row │ a      b      c      d
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     1      6     11      2
   2 │     2      7     12      3
   3 │     3      8     13      4
   4 │     4      9     14      5
   5 │     5     10     15      6

julia> @transform(df, "a_minus_{2}" = :a - {[:b, :c]})
5×5 DataFrame
 Row │ a      b      c      a_minus_b  a_minus_c
     │ Int64  Int64  Int64  Int64      Int64
─────┼───────────────────────────────────────────
   1 │     1      6     11         -5        -10
   2 │     2      7     12         -5        -10
   3 │     3      8     13         -5        -10
   4 │     4      9     14         -5        -10
   5 │     5     10     15         -5        -10

julia> @transform(df, "{1}_minus_{2}" = {[:a, :b, :c]} - {[:a, :b, :c]'})
5×12 DataFrame
 Row │ a      b      c      a_minus_a  b_minus_a  c_minus_a  a_minus_b  b_minu ⋯
     │ Int64  Int64  Int64  Int64      Int64      Int64      Int64      Int64  ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │     1      6     11          0          5         10         -5         ⋯
   2 │     2      7     12          0          5         10         -5
   3 │     3      8     13          0          5         10         -5
   4 │     4      9     14          0          5         10         -5
   5 │     5     10     15          0          5         10         -5         ⋯
                                                               5 columns omitted

@combine

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(a = 1:5, b = 6:10, c = 11:15)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      6     11
   2 │     2      7     12
   3 │     3      8     13
   4 │     4      9     14
   5 │     5     10     15

julia> @combine(df, :mean_a = mean(:a))
1×1 DataFrame
 Row │ mean_a
     │ Float64
─────┼─────────
   1 │     3.0

julia> @combine(df, "mean_{}" = mean({All()}))
1×3 DataFrame
 Row │ mean_a   mean_b   mean_c
     │ Float64  Float64  Float64
─────┼───────────────────────────
   1 │     3.0      8.0     13.0

julia> @combine(df, "first_3_{}" = first({Not(:b)}, 3))
3×2 DataFrame
 Row │ first_3_a  first_3_c
     │ Int64      Int64
─────┼──────────────────────
   1 │         1         11
   2 │         2         12
   3 │         3         13

julia> @combine(df, begin
           :mean_a = mean(:a)
           :median_b = median(:b)
           :sum_c = sum(:c)
       end)
1×3 DataFrame
 Row │ mean_a   median_b  sum_c
     │ Float64  Float64   Int64
─────┼──────────────────────────
   1 │     3.0       8.0     65

@sort

julia> using DataFrames
julia> using DataFrameMacros
julia> using Random

julia> Random.seed!(123)
MersenneTwister(123)

julia> df = DataFrame(randn(5, 5), :auto)
5×5 DataFrame
 Row │ x1         x2          x3         x4         x5
     │ Float64    Float64     Float64    Float64    Float64
─────┼────────────────────────────────────────────────────────
   1 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539
   2 │  2.04818    0.980968   -0.843878  -0.281133   0.260402
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
   4 │  0.459416   0.273815    0.327215  -0.71741   -0.880897
   5 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726

julia> @sort(df, :x1)
5×5 DataFrame
 Row │ x1         x2          x3         x4         x5
     │ Float64    Float64     Float64    Float64    Float64
─────┼────────────────────────────────────────────────────────
   1 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726
   2 │  0.459416   0.273815    0.327215  -0.71741   -0.880897
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
   4 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539
   5 │  2.04818    0.980968   -0.843878  -0.281133   0.260402

julia> @sort(df, -:x1)
5×5 DataFrame
 Row │ x1         x2          x3         x4         x5
     │ Float64    Float64     Float64    Float64    Float64
─────┼────────────────────────────────────────────────────────
   1 │  2.04818    0.980968   -0.843878  -0.281133   0.260402
   2 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
   4 │  0.459416   0.273815    0.327215  -0.71741   -0.880897
   5 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726

julia> @sort(df, :x2 * :x3)
5×5 DataFrame
 Row │ x1         x2          x3         x4         x5
     │ Float64    Float64     Float64    Float64    Float64
─────┼────────────────────────────────────────────────────────
   1 │  2.04818    0.980968   -0.843878  -0.281133   0.260402
   2 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489
   4 │  0.459416   0.273815    0.327215  -0.71741   -0.880897
   5 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539

julia> df2 = DataFrame(a = [1, 2, 2, 1, 2], b = [4, 4, 4, 3, 3], c = [5, 7, 5, 7, 5])
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      4      5
   2 │     2      4      7
   3 │     2      4      5
   4 │     1      3      7
   5 │     2      3      5

julia> @sort(df2, :a, :b)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      3      7
   2 │     1      4      5
   3 │     2      3      5
   4 │     2      4      7
   5 │     2      4      5

julia> @sort(df2, :c - :a - :b)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     2      4      5
   2 │     1      4      5
   3 │     2      3      5
   4 │     2      4      7
   5 │     1      3      7

@groupby

julia> using DataFrames
julia> using DataFrameMacros
julia> using Random

julia> Random.seed!(123)
MersenneTwister(123)

julia> df = DataFrame(
           color = ["red", "red", "red", "blue", "blue"],
           size = ["big", "small", "big", "small", "big"],
           height = [1, 2, 3, 4, 5],
       )
5×3 DataFrame
 Row │ color   size    height
     │ String  String  Int64
─────┼────────────────────────
   1 │ red     big          1
   2 │ red     small        2
   3 │ red     big          3
   4 │ blue    small        4
   5 │ blue    big          5

julia> @groupby(df, :color)
GroupedDataFrame with 2 groups based on key: color
First Group (3 rows): color = "red"
 Row │ color   size    height
     │ String  String  Int64
─────┼────────────────────────
   1 │ red     big          1
   2 │ red     small        2
   3 │ red     big          3
⋮
Last Group (2 rows): color = "blue"
 Row │ color   size    height
     │ String  String  Int64
─────┼────────────────────────
   1 │ blue    small        4
   2 │ blue    big          5

julia> @groupby(df, :color, :size)
GroupedDataFrame with 4 groups based on keys: color, size
First Group (2 rows): color = "red", size = "big"
 Row │ color   size    height
     │ String  String  Int64
─────┼────────────────────────
   1 │ red     big          1
   2 │ red     big          3
⋮
Last Group (1 row): color = "blue", size = "big"
 Row │ color   size    height
     │ String  String  Int64
─────┼────────────────────────
   1 │ blue    big          5

julia> @groupby(df, :evenheight = iseven(:height))
GroupedDataFrame with 2 groups based on key: evenheight
First Group (3 rows): evenheight = false
 Row │ color   size    height  evenheight
     │ String  String  Int64   Bool
─────┼────────────────────────────────────
   1 │ red     big          1       false
   2 │ red     big          3       false
   3 │ blue    big          5       false
⋮
Last Group (2 rows): evenheight = true
 Row │ color   size    height  evenheight
     │ String  String  Int64   Bool
─────┼────────────────────────────────────
   1 │ red     small        2        true
   2 │ blue    small        4        true

@astable

julia> using DataFrames
julia> using DataFrameMacros

julia> df = DataFrame(name = ["Jeff Bezanson", "Stefan Karpinski", "Alan Edelman", "Viral Shah"])
4×1 DataFrame
 Row │ name
     │ String
─────┼──────────────────
   1 │ Jeff Bezanson
   2 │ Stefan Karpinski
   3 │ Alan Edelman
   4 │ Viral Shah

julia> @select(df, @astable :first, :last = split(:name))
4×2 DataFrame
 Row │ first      last
     │ SubStrin…  SubStrin…
─────┼──────────────────────
   1 │ Jeff       Bezanson
   2 │ Stefan     Karpinski
   3 │ Alan       Edelman
   4 │ Viral      Shah

julia> @select(df, @astable begin
           f, l = split(:name)
           :first, :last = f, l
           :initials = first(f) * "." * first(l) * "."
       end)
4×3 DataFrame
 Row │ first      last       initials
     │ SubStrin…  SubStrin…  String
─────┼────────────────────────────────
   1 │ Jeff       Bezanson   J.B.
   2 │ Stefan     Karpinski  S.K.
   3 │ Alan       Edelman    A.E.
   4 │ Viral      Shah       V.S.

@passmissing

julia> using DataFrames
julia> using DataFrameMacros

julia> df = DataFrame(short = ["cat", "dog", "mouse", "duck"], long = ["catch", "dogged", missing, "docks"])
4×2 DataFrame
 Row │ short   long
     │ String  String?
─────┼─────────────────
   1 │ cat     catch
   2 │ dog     dogged
   3 │ mouse   missing
   4 │ duck    docks

julia> @transform(df, :startswith = @passmissing startswith(:long, :short))
4×3 DataFrame
 Row │ short   long     startswith
     │ String  String?  Bool?
─────┼─────────────────────────────
   1 │ cat     catch          true
   2 │ dog     dogged         true
   3 │ mouse   missing     missing
   4 │ duck    docks         false

Multiple columns in {}

If {} contains a multi-column expression, then the function is run for each combination of arguments determined by broadcasting all sets together.

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(a = 1:5, b = 6:10, c = 11:15)
5×3 DataFrame
 Row │ a      b      c
     │ Int64  Int64  Int64
─────┼─────────────────────
   1 │     1      6     11
   2 │     2      7     12
   3 │     3      8     13
   4 │     4      9     14
   5 │     5     10     15

julia> @select(df, :a + {[:b, :c]})
5×2 DataFrame
 Row │ a_b_+  a_c_+
     │ Int64  Int64
─────┼──────────────
   1 │     7     12
   2 │     9     14
   3 │    11     16
   4 │    13     18
   5 │    15     20

julia> @select(df, :a + {Not(:a)})
5×2 DataFrame
 Row │ a_b_+  a_c_+
     │ Int64  Int64
─────┼──────────────
   1 │     7     12
   2 │     9     14
   3 │    11     16
   4 │    13     18
   5 │    15     20

julia> @select(df, {[:a, :b]} + {[:b, :c]})
5×2 DataFrame
 Row │ a_b_+  b_c_+
     │ Int64  Int64
─────┼──────────────
   1 │     7     17
   2 │     9     19
   3 │    11     21
   4 │    13     23
   5 │    15     25

julia> @select(df, {[:a, :b]} + {[:b, :c]'})
5×4 DataFrame
 Row │ a_b_+  b_b_+  a_c_+  b_c_+
     │ Int64  Int64  Int64  Int64
─────┼────────────────────────────
   1 │     7     12     12     17
   2 │     9     14     14     19
   3 │    11     16     16     21
   4 │    13     18     18     23
   5 │    15     20     20     25
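The pairing of columns in the last two calls follows ordinary Julia broadcasting rules. The sketch below uses plain arrays of symbols and no DataFrameMacros.jl at all, assuming the macro combines column sets the same way; it writes the row as the literal `[:b :c]` where the examples above use `'`.

```julia
cols_left = [:a, :b]

# Two vectors of equal length are paired element by element.
zipped = tuple.(cols_left, [:b, :c])

# A row on the right-hand side yields every combination instead (a 2×2 matrix).
grid = tuple.(cols_left, [:b :c])

# Column-major order of the combinations.
combos = vec(grid)
```

`zipped` holds the pairs `(:a, :b)` and `(:b, :c)`, while `combos` holds `(:a, :b)`, `(:b, :b)`, `(:a, :c)` and `(:b, :c)`, the same order as the four result columns above.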

{{}} syntax

The double brace syntax refers to multiple columns as a tuple, which means that you can aggregate over a larger number of columns than it would be practical to write out explicitly.

julia> using DataFrames
julia> using DataFrameMacros
julia> using Random
julia> using Statistics

julia> Random.seed!(123)
MersenneTwister(123)

julia> df = DataFrame(
           jan = randn(5),
           feb = randn(5),
           mar = randn(5),
           apr = randn(5),
           may = randn(5),
           jun = randn(5),
           jul = randn(5),
       )
5×7 DataFrame
 Row │ jan        feb         mar        apr        may        jun        jul  ⋯
     │ Float64    Float64     Float64    Float64    Float64    Float64    Floa ⋯
─────┼──────────────────────────────────────────────────────────────────────────
   1 │  1.19027   -0.664713   -0.339366   0.368002  -0.979539   1.52392   -0.8 ⋯
   2 │  2.04818    0.980968   -0.843878  -0.281133   0.260402  -1.77773    0.3
   3 │  1.14265   -0.0754831  -0.888936  -0.734886  -0.468489  -2.93306   -0.1
   4 │  0.459416   0.273815    0.327215  -0.71741   -0.880897   0.782258   2.3
   5 │ -0.396679  -0.194229    0.592403  -0.77507    0.277726   2.31358   -0.9 ⋯
                                                                1 column omitted

julia> @select(df, :july_larger = :jul > median({{Between(:jan, :jun)}}))
5×1 DataFrame
 Row │ july_larger
     │ Bool
─────┼─────────────
   1 │       false
   2 │        true
   3 │        true
   4 │        true
   5 │       false

julia> @select(df, :mean_smaller = mean({{All()}}) < median({{All()}}))
5×1 DataFrame
 Row │ mean_smaller
     │ Bool
─────┼──────────────
   1 │        false
   2 │         true
   3 │         true
   4 │        false
   5 │        false
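To make the tuple idea concrete, here is a sketch that compares a {{ }} expression with a hand-written DataFrames.jl call in which the tuple is built explicitly. The hand-written form is an illustration of the behaviour, not necessarily the code the macro generates, and the small data frame is separate from the example above.

```julia
using DataFrames
using DataFrameMacros
using Statistics

df = DataFrame(a = 1:5, b = 6:10, c = 11:15)

# {{[:b, :c]}} hands the row's values of b and c to mean as one tuple.
with_macro = @select(df, :x = :a + mean({{[:b, :c]}}))

# Hand-written equivalent: the tuple (b, c) is written out in the function body.
by_hand = select(df, [:a, :b, :c] => ByRow((a, b, c) -> a + mean((b, c))) => :x)

with_macro == by_hand
```

With two columns the hand-written version is still manageable; with a selector such as `Between(:jan, :jun)` the explicit argument list grows with every column, which is the case {{ }} is meant for.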

@transform! on @subset

DataFrames.jl allows calling transform! on a view returned by subset(df, ..., view = true). If you pass a @subset macro call without a data frame argument to @transform!, the view is created automatically, the transformation is applied to it, and the original argument is returned.

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(
           name = ["Chicken", "Pork", "Apple", "Pear", "Beef"],
           type = ["Meat", "Meat", "Fruit", "Fruit", "Meat"],
           price = [4.99, 5.99, 0.99, 1.29, 6.99],
       )
5×3 DataFrame
 Row │ name     type    price
     │ String   String  Float64
─────┼──────────────────────────
   1 │ Chicken  Meat       4.99
   2 │ Pork     Meat       5.99
   3 │ Apple    Fruit      0.99
   4 │ Pear     Fruit      1.29
   5 │ Beef     Meat       6.99

julia> @transform!(df, @subset(:type == "Meat"), :price = :price + 2)
5×3 DataFrame
 Row │ name     type    price
     │ String   String  Float64
─────┼──────────────────────────
   1 │ Chicken  Meat       6.99
   2 │ Pork     Meat       7.99
   3 │ Apple    Fruit      0.99
   4 │ Pear     Fruit      1.29
   5 │ Beef     Meat       8.99

julia> @transform!(df, @subset(:price < 7, :name != "Pear"), :n_sold = round(Int, :price * 5))
5×4 DataFrame
 Row │ name     type    price    n_sold
     │ String   String  Float64  Int64?
─────┼───────────────────────────────────
   1 │ Chicken  Meat       6.99       35
   2 │ Pork     Meat       7.99  missing
   3 │ Apple    Fruit      0.99        5
   4 │ Pear     Fruit      1.29  missing
   5 │ Beef     Meat       8.99  missing

julia> @transform!(
           @groupby(df, :type),
           @subset(@bycol :price .< mean(:price)),
           :price = 100 * :price)
5×4 DataFrame
 Row │ name     type    price    n_sold
     │ String   String  Float64  Int64?
─────┼───────────────────────────────────
   1 │ Chicken  Meat     699.0        35
   2 │ Pork     Meat       7.99  missing
   3 │ Apple    Fruit     99.0         5
   4 │ Pear     Fruit      1.29  missing
   5 │ Beef     Meat       8.99  missing
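For comparison, the first @transform! call above can be written by hand with the view-based DataFrames.jl calls that the introduction of this section describes. This is a sketch of the described behaviour, not the exact code the macro expands to; the helper `prices` and the shortened table are made up for illustration.

```julia
using DataFrames
using DataFrameMacros

# Hypothetical helper that builds a fresh copy of a small price table.
prices() = DataFrame(
    name = ["Chicken", "Pork", "Apple"],
    type = ["Meat", "Meat", "Fruit"],
    price = [4.99, 5.99, 0.99],
)

# Macro form: only the rows selected by @subset are changed.
df_macro = prices()
@transform!(df_macro, @subset(:type == "Meat"), :price = :price + 2)

# Hand-written form: take a view of the matching rows, then transform it in place.
df_manual = prices()
meat = subset(df_manual, :type => ByRow(==("Meat")), view = true)
transform!(meat, :price => ByRow(p -> p + 2) => :price)

df_macro == df_manual
```

In both versions the parent data frame is modified, and rows outside the subset keep their original values.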

Special case @nrow

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(x = [1, 1, 1, 2, 2])
5×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     1
   3 │     1
   4 │     2
   5 │     2

julia> @transform(df, @nrow)
5×2 DataFrame
 Row │ x      nrow
     │ Int64  Int64
─────┼──────────────
   1 │     1      5
   2 │     1      5
   3 │     1      5
   4 │     2      5
   5 │     2      5

julia> @combine(groupby(df, :x), :count = @nrow)
2×2 DataFrame
 Row │ x      count
     │ Int64  Int64
─────┼──────────────
   1 │     1      3
   2 │     2      2

Special case @eachindex

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(x = [1, 1, 1, 2, 2])
5×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     1
   3 │     1
   4 │     2
   5 │     2

julia> @transform(df, @eachindex)
5×2 DataFrame
 Row │ x      eachindex
     │ Int64  Int64
─────┼──────────────────
   1 │     1          1
   2 │     1          2
   3 │     1          3
   4 │     2          4
   5 │     2          5

julia> @combine(groupby(df, :x), :i = @eachindex)
5×2 DataFrame
 Row │ x      i
     │ Int64  Int64
─────┼──────────────
   1 │     1      1
   2 │     1      2
   3 │     1      3
   4 │     2      1
   5 │     2      2

Special case @proprow

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(x = [1, 1, 1, 2, 2])
5×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     1
   3 │     1
   4 │     2
   5 │     2

julia> @combine(groupby(df, :x), :p = @proprow)
2×2 DataFrame
 Row │ x      p
     │ Int64  Float64
─────┼────────────────
   1 │     1      0.6
   2 │     2      0.4

Special case @groupindices

julia> using DataFrames
julia> using DataFrameMacros
julia> using Statistics

julia> df = DataFrame(x = [1, 1, 1, 2, 2])
5×1 DataFrame
 Row │ x
     │ Int64
─────┼───────
   1 │     1
   2 │     1
   3 │     1
   4 │     2
   5 │     2

julia> @combine(groupby(df, :x), :gi = @groupindices)
2×2 DataFrame
 Row │ x      gi
     │ Int64  Int64
─────┼──────────────
   1 │     1      1
   2 │     2      2