wrds_advanced.md (3123B)
1 # Advanced WRDS 2 3 4 The package provides ready-built query to the postgres WRDS database. 5 The queries are the most common and used ones and allow to get a crsp/compustat dataset fairly quickly. 6 7 Some people will want to run their own queries. 8 This is a little outside the scope of the package but I keep this page as a list of examples that can be useful. 9 Post an issue if you want to add to the list. 10 11 We are always going to start by opening a connection to WRDS. 12 So for the rest of the examples I will assume that we include the preamble 13 ```julia 14 using FinanceRoutines 15 using DataFrames, DataPipes, Dates 16 import LibPQ 17 18 using BazerData # see https://github.com/louloulibs/BazerData.jl 19 wrds_conn = FinanceRoutines.open_wrds_pg(); 20 ``` 21 22 ## CRSP: Filtering on names 23 24 To get information on what columns are available, query the `information_schema`: 25 ```julia 26 postgre_query_columns= """ 27 SELECT * 28 FROM information_schema.columns 29 WHERE table_schema = 'crsp' 30 AND table_name = 'StkSecurityInfoHist'; 31 """ 32 msenames_columns = @p LibPQ.execute(wrds_conn, postgre_query_columns) |> 33 DataFrame |> __.column_name |> sort 34 ``` 35 36 Sometimes we want to know what are the tables available, for example the `meta` tables 37 ```julia 38 postgre_query = """ 39 SELECT table_name, table_schema, table_type 40 FROM information_schema.tables 41 WHERE table_name LIKE 'meta%' AND table_schema = 'crsp' 42 """ 43 LibPQ.execute(wrds_conn, postgre_query) |> DataFrame 44 ``` 45 46 47 Now imagine that we want to pull columns from the monthly stock file with the following conditions: a given time frame (e.g. the 2000s) and company names that contain "AP". 48 We would get the `permno` that match from `StkSecurityInfoHist` and match it on the fly to `msf` 49 50 ```julia 51 postgre_query = """ 52 SELECT msf.cusip, msf.permno, msf.mthcaldt, msf.mthprc, msf.mthret, msf.mthvol, msf.shrout, msf.siccd, 53 stkinfo.issuernm 54 FROM crsp.msf_v2 AS msf 55 INNER JOIN crsp.StkSecurityInfoHist AS stkinfo 56 ON msf.permno = stkinfo.permno 57 WHERE stkinfo.issuernm ~ '(^APPLE|TESLA)' 58 AND msf.mthcaldt >= '2010-01-01' 59 AND msf.mthcaldt <= '2019-12-31'; 60 """ 61 62 df_msf = LibPQ.execute(wrds_conn, postgre_query) |> DataFrame 63 tabulate(df_msf, [:permno, :issuernm]) 64 ``` 65 66 ```bash 67 permno issuernm │ Freq. Percent Cum Hist. 68 ────────────────────────────────────┼─────────────────────────────────────────────── 69 14593 APPLE COMPUTER INC │ 600 29.2 29 ███████████████████████▉ 70 14593 APPLE INC │ 600 29.2 58 ███████████████████████▉ 71 15338 APPLE HOSPITALITY REIT INC │ 280 13.6 72 ███████████▏ 72 93436 TESLA MOTORS INC │ 115 5.6 78 ████▋ 73 93436 TESLA INC │ 460 22.4 100 ██████████████████▎ 74 ``` 75 76 77 78 79