# Xtile

The function `xtile` tries to emulate Stata's [xtile](https://www.stata.com/manuals/dpctile.pdf) function.

The [`Binscatters.jl`](https://github.com/matthieugomez/Binscatters.jl) package already implements these features.


```@setup hist
import Pkg; Pkg.add("Plots");
using Plots, Random, StatsBase, BazerData
gr(); theme(:wong2); Plots.default(display_type=:inline, size=(1250,750), thickness_scaling=1)
```


## Basic usage

Start with a simple distribution to visualize how `xtile` bins the data:
```@example hist
# Draw a reproducible standard-normal sample and plot its histogram
Random.seed!(3); x = randn(10_000);
p1 = histogram(x,
    bins=-4:0.1:4, alpha=0.25, color="grey", label="",
    framestyle=:box, size=(1250,750))
savefig(p1, "p1.svg"); nothing # hide
```



The quintiles split the distribution:
```@example hist
# Column 1 holds the values, column 2 the quintile bin assigned by xtile
x_tile = hcat(x, xtile(x, 5))
p2 = histogram(x, bins=-4:0.1:4, alpha=0.25, color="grey",
    label="", framestyle=:box);
# Overlay one histogram per quintile bin
[ histogram!(x_tile[ x_tile[:, 2] .== i , 1], bins=-4:0.1:4,
    alpha=0.75, label="quantile bin $i")
  for i in 0:4 ];
savefig(p2, "p2.svg"); nothing # hide
```



It is also possible to include weights:
```@example hist
# Sort the sample so that the weight of observation i follows its rank
x_sorted = sort(x)
x_tile_weights = xtile(x_sorted, 5,
    weights=Weights([ log(i)/i for i in 1:length(x)]) )
p3 = histogram(x, bins=-4:0.1:4, alpha=0.25, color="grey",
    label="", framestyle=:box);
# Overlay one histogram per weighted quintile bin
[ histogram!(x_sorted[x_tile_weights.==i], bins=-4:0.1:4,
    alpha=0.75, label="quantile bin $i")
  for i in 0:4 ];
savefig(p3, "p3.svg"); nothing # hide
```
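
To make the unweighted quintile split above more concrete, here is a minimal sketch of quantile binning written with the standard library only. The helper `quantile_bins` is hypothetical and is not part of `BazerData`; it assumes bins are numbered from `0` to `n-1` (as in the plots above) and that a value equal to a cut point falls in the lower bin. The actual `xtile` implementation may treat ties and interpolation differently.

```julia
using Statistics, Random

# Hypothetical helper (NOT the BazerData implementation): assign each value
# of x to one of n bins delimited by the empirical quantiles of x.
function quantile_bins(x::AbstractVector{<:Real}, n::Integer)
    # n-1 interior cut points at probabilities 1/n, 2/n, ..., (n-1)/n
    cuts = quantile(x, (1:n-1) ./ n)
    # searchsortedfirst gives the index of the first cut point >= v;
    # subtracting 1 yields a bin label between 0 and n-1
    return [searchsortedfirst(cuts, v) - 1 for v in x]
end

Random.seed!(3)
x = randn(10_000)
bins = quantile_bins(x, 5)

# Number of observations in each bin; the bins should be of similar size
counts = [count(==(i), bins) for i in 0:4]
```

Comparing `counts` with the bin sizes obtained from `xtile(x, 5)` is a quick way to check whether the two conventions agree on a given sample.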