{"id":12632,"date":"2016-01-23T18:28:32","date_gmt":"2016-01-23T23:28:32","guid":{"rendered":"http:\/\/scruss.com\/blog\/?p=12632"},"modified":"2016-01-23T18:28:32","modified_gmt":"2016-01-23T23:28:32","slug":"q-absolves-data-sins-and-makes-csv-queries-easy","status":"publish","type":"post","link":"https:\/\/scruss.com\/blog\/2016\/01\/23\/q-absolves-data-sins-and-makes-csv-queries-easy\/","title":{"rendered":"q absolves data sins and makes CSV queries easy"},"content":{"rendered":"<p>The cryptically-named <a href=\"http:\/\/harelba.github.io\/q\/\">q<\/a> (it also bills itself as being able to \u00e2\u20ac\u0153<em>Run SQL directly on CSV files | Text as Data\u00e2\u20ac\u009d<\/em>) is <em>very<\/em> nifty indeed. It allows you to run SQL queries on delimited text files. It seems to support the full <a href=\"https:\/\/sqlite.org\/\">SQLite<\/a> SQL dialect, too.<\/p>\n<p>I used to frequently query the <a href=\"http:\/\/www.ieso.ca\/\">IESO<\/a>&#8216;s <a href=\"http:\/\/www.ieso.ca\/Pages\/Power-Data\/default.aspx\">Hourly Wind Generator Output<\/a> report (which now hides behind a JS link to obscure the source URL, <a href=\"http:\/\/www.ieso.ca\/\/imoweb\/pubs\/marketReports\/download\/HourlyWindFarmGen_20160122.csv\">http:\/\/www.ieso.ca\/\/imoweb\/pubs\/marketReports\/download\/HourlyWindFarmGen_20160122.csv<\/a>).\u00c2\u00a0 Now that the file has nearly 10 years of hourly data and many (but not all) wind projects, it may have outlived its usefulness. But it does allow me to show off some mad SQLite skills \u00e2\u20ac\u00a6<\/p>\n<p>The first problem is that the file uses nasty date formats. Today would be 23-Jan-16 in the report&#8217;s Date field, which is filled with the ugh. You can fix that, though, with a fragment of SQL modified from <a href=\"https:\/\/stackoverflow.com\/questions\/1181123\/date-formatting-from-sqlite-query\/33116186#33116186\">here<\/a>:<\/p>\n<pre>printf(\"%4d-%02d-%02d\", substr(Date, 8,2)+2000, (instr(\"---JanFebMarAprMayJunJulAugSepOctNovDec\", substr(Date, 4,3))-1)\/3, substr(Date, 1, 2)) as isodate<\/pre>\n<p>The above data definition sets the isodate column to be in the familiar and useful YYYY-MM-DD ISO format.<\/p>\n<p>A related example would be to query the whole CSV file for monthly mean generation from Kingsbridge and K2 Wind projects (they&#8217;re next to one another) for months after K2&#8217;s commissioning in March 2015. Here&#8217;s what I did in q:<\/p>\n<pre>q -T -O -H -d, 'select printf(\"%4d-%02d\", substr(Date, 8,2)+2000, (instr(\"---JanFebMarAprMayJunJulAugSepOctNovDec\", substr(Date, 4,3))-1)\/3) as isomonth, avg(KINGSBRIDGE) as kavg, avg(K2WIND) as k2avg from Downloads\/HourlyWindFarmGen_20160122.csv where isomonth&gt;\"2015-03\" group by isomonth'<\/pre>\n<p>which gave the results:<\/p>\n<pre>isomonth\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0kavg\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0k2avg\r\n2015-04\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a012.7277777778\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a037.4569444444\r\n2015-05\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a08.94623655914\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a067.6747311828\r\n2015-06\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a06.05833333333\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a066.6847222222\r\n2015-07\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a03.96370967742\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a045.372311828\r\n2015-08\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a06.34811827957\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a067.436827957\r\n2015-09\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a07.29027777778\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a079.7194444444\r\n2015-10\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a014.5658602151\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0128.037634409\r\n2015-11\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a015.9944444444\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0130.729166667\r\n2015-12\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a017.6075268817\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0152.422043011\r\n2016-01\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a019.6408730159\u00c2\u00a0\u00c2\u00a0 \u00c2\u00a0163.013888889<\/pre>\n<p>Neat! (or at least, <em>I<\/em> think so.)<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The cryptically-named q (it also bills itself as being able to \u00e2\u20ac\u0153Run SQL directly on CSV files | Text as Data\u00e2\u20ac\u009d) is very nifty indeed. It allows you to run SQL queries on delimited text files. It seems to support the full SQLite SQL dialect, too. I used to frequently query the IESO&#8216;s Hourly Wind [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"q absolves data sins and makes CSV queries easy #qtextasdata","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[7,8],"tags":[2266,2957,2540,2956,2086,603,59],"class_list":["post-12632","post","type-post","status-publish","format-standard","hentry","category-computers-suck","category-wind-things","tag-csv","tag-ieso","tag-python","tag-qtextasdata","tag-sql","tag-sqlite","tag-wind"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"","jetpack_shortlink":"https:\/\/wp.me\/pQNZZ-3hK","jetpack_sharing_enabled":true,"_links":{"self":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/12632","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/comments?post=12632"}],"version-history":[{"count":2,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/12632\/revisions"}],"predecessor-version":[{"id":12634,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/posts\/12632\/revisions\/12634"}],"wp:attachment":[{"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/media?parent=12632"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/categories?post=12632"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/scruss.com\/blog\/wp-json\/wp\/v2\/tags?post=12632"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}