Friday, 16 June 2017

TABLEAU PUBLIC OPENS NEW DATA AREAS

The new version of Tableau Public, 10.3, makes working with data a lot easier. Here are some of the most important improvements:
  1. PDFs are always a pain in the rear, and cracking the file open can sometimes be hard, whether with one of the web services like PDF to Excel or with Tabula. Now Tableau can open PDFs and connect them directly to a worksheet.
  2. If you don't want to work with Excel, there is always Google Sheets. But to get the sheets into Tableau, you had to export them to .xls format. Now we can import Google Sheets directly into Tableau.
  3. Excel has its limitations for statistical analysis. R has many more tools under the hood, but its visualization options are limited, especially online. Starting with Tableau 10.3, .Rdata files can be imported directly into Tableau worksheets.
  4. Making maps with Tableau used to have important limitations, because you had to rely on the maps provided by Tableau. My solution was to produce the map in QGIS and export it to Google Fusion Tables. And here it is: Tableau now reads shapefiles (.shp) and makes beautiful maps. Adding data to the map is no problem: choose one of four different join types between your map and your data.
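Tableau's four join types between a map and a data table are the same inner, left, right and full outer joins known from SQL. As a rough illustration of what each one keeps, here is a minimal pandas sketch (the region names and figures are made up):

```python
import pandas as pd

# Hypothetical map table: one row per region in the shapefile
regions = pd.DataFrame({"region": ["North", "South", "East"]})

# Hypothetical data table; note "West" has no matching region
figures = pd.DataFrame({"region": ["North", "South", "West"],
                        "value": [10, 20, 30]})

inner = regions.merge(figures, on="region", how="inner")  # only matching rows
left = regions.merge(figures, on="region", how="left")    # keep every region
outer = regions.merge(figures, on="region", how="outer")  # keep everything

print(len(inner), len(left), len(outer))  # 2 3 4
```

A left join is usually what you want for a map: every region stays on the map, and regions without data simply get an empty value.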

Saturday, 10 June 2017

IT IS TIME DATA JOURNALISTS LEARN TO CODE



 Since the beginnings of data journalism in the nineties of the last century, then called CARR or Computer-Assisted Research and Reporting, techniques for analyzing and visualizing data have improved enormously. One of the central tools in the nineties was the spreadsheet, standardized by Microsoft Excel. Spreadsheets are still widely used for analysis, but moving into advanced data journalism, using for example R for deeper statistical analysis or D3 for better interactive graphics, creates various new challenges. Then you will often engage in different types of coding: I got stuck between R and Python (for analysis) and JavaScript (for D3). Does a data journalist need to learn all these programming languages, or is there an easier and faster solution?
Looking at journalism practice, the answer is: step onto the steep learning curve and start learning how to code. Here is some help. Paul Bradshaw starts an MA in Data Journalism at the Birmingham School of Media next year. "Coding and computational thinking being applied journalistically (I cover using JavaScript, R, and Python, command line, SQL and Regex to pursue stories)" is one of the elements of this new MA, writes Bradshaw on his blog.
Looking at the market, there is real demand for data journalists with coding skills. Here is a job listing from The Economist. One of the preferred qualities: "A good understanding of data analytics and coding skills (JavaScript and Python), or a background in data journalism, are a plus."
In the following I will argue that a basic understanding of coding is very helpful, but that new services on the web help data journalists avoid getting stuck in coding.
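To give an idea of the kind of coding the Birmingham MA mentions, here is a small, purely illustrative Python example (the sentence and figures are made up): a regular expression pulling numbers out of a sentence, a routine task when digging stories out of text:

```python
import re

text = "The council spent 1,250 euros in 2015 and 3,400 euros in 2016."

# Match a digit followed by any mix of digits and thousands separators,
# then strip the separators and convert to integers
numbers = [int(m.replace(",", "")) for m in re.findall(r"\d[\d,]*", text)]

print(numbers)  # [1250, 2015, 3400, 2016]
```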

Tuesday, 16 May 2017

INKSCAPE for DATA JOURNALISM

Playing around with scalable vector graphics (.svg): these are text files describing shapes like circles, rectangles etc. They are important for creating charts with D3, data-driven documents. For manipulating .svg I use Inkscape, an open-source alternative to Adobe Illustrator. Available for Windows, Mac and Linux at: https://inkscape.org/en/
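Because an .svg is just text, you can generate one without any graphics library. This minimal Python sketch writes a file with one rectangle and one circle that Inkscape (or any browser) can open; the sizes and colours are arbitrary choices:

```python
# Build a minimal SVG document as plain text
shapes = [
    '<rect x="10" y="10" width="80" height="40" fill="steelblue"/>',
    '<circle cx="150" cy="30" r="20" fill="orange"/>',
]
svg = ('<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">\n'
       + "\n".join(shapes) + "\n</svg>\n")

# Write it to disk; open demo.svg in Inkscape to edit the shapes
with open("demo.svg", "w") as f:
    f.write(svg)

print(svg)
```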

Tuesday, 9 May 2017

BOOTING UBUNTU 16.04 FROM 32GB USB PERSISTENT


 

Bought a 32 GB USB drive… that is a complete hard drive! Instead of dual-booting Linux from the hard drive, you can boot from the USB in persistent mode, saving your work and settings. Reading/writing and booting speed are not really different from dual booting. This Philips USB 3.0 Circle 32 GB (compatible with USB 2.0) reads at 55 MB/s and writes at 10 MB/s. To speed things up, you can run the whole thing in RAM.

UEFI

Booting from USB is a bit more difficult with UEFI. Generally a live Ubuntu starts from UEFI with secure boot on. However, installing an unsigned module in the kernel (for example bcmwl-kernel-source, for wireless) is then blocked, so let's disable secure boot. That works perfectly; however, if you want to start a persistent live Ubuntu, you need to boot from UEFI with CSM, because the Ubuntu .iso is a hybrid, and the Compatibility Support Module (CSM) provides legacy BIOS compatibility.

So set up the system as follows: disable secure boot and fast boot, and enable UEFI with CSM.

 

MKUSB

To make the USB boot persistent, use mkusb. Choose the image, select the drive, and tick UEFI and persistent. However, when working on an HP (mine is an HP Elite 820), choose the MSDOS partition table, not the GUID partition table (GPT) that is standard for UEFI, because HP seems to like MSDOS better.

Now I am booting into Ubuntu from a portable drive, which I can carry along and use on any machine.

Wednesday, 12 April 2017

My bot ALEXA: from news-update to reading mail

Amazon's software for its Alexa bot has been ported to the Raspberry Pi. It is a nice experience to ask for the news (BBC or NYTimes) or to listen to the emails I just received.

Installing is a piece of cake.
Here is the recipe: How to Build Your Own Amazon Echo with a Raspberry Pi .
Two problems after installing:
- I cannot get the WakeWordAgent working; this means that instead of yelling ALEXA... I now have to push the 'listen' button to start my question.
- of course you want to use your local settings; here is a how-to: Using outside the US.
  The time zone can be set, but not the location; it accepts only US, GB and DE zip codes. I am stuck with Dusseldorf weather.

Tuesday, 3 January 2017

The Electronic Barometer



This goes beyond data journalism. It is a project about the Internet of Things, studying the relationship between measurement/observation, data and data storage, and finally retrieving, analyzing and visualizing data, if possible in real time. Here is the story.

Monday, 2 January 2017

THE ELECTRONIC BAROMETER 3



In part 1 we discussed publishing real-time data; in part 2 I highlighted storing data in a database and publishing it on a blog. In this last part of the series I will pay attention to retrieving data from the database and visualizing the query results.
The simplest way to retrieve data from MySQL is to install phpMyAdmin, which gives you complete control over the database, tables and queries. phpMyAdmin makes building a query and exporting the result to .csv very easy. The exported .csv can then be used for further analysis with, for example, Excel, and for visualizing with Google.
More interesting is to make a direct connection to the database from R: make the query, then analyze and visualize directly in R. Finally, I pay attention to publishing these results online with the plot.ly REST API.
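The query-and-export step can be sketched in a few lines of Python. This sketch uses the standard-library sqlite3 module as a stand-in for MySQL (a real setup would use a MySQL driver instead), and the table and column names are made up for the example:

```python
import csv
import sqlite3

# In-memory database standing in for the MySQL barometer database
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE readings (ts TEXT, pressure REAL)")
con.executemany("INSERT INTO readings VALUES (?, ?)",
                [("2017-01-01 12:00", 1013.2), ("2017-01-01 13:00", 1012.8)])

# Run the query, as phpMyAdmin would, and export the result to .csv
rows = con.execute("SELECT ts, pressure FROM readings ORDER BY ts").fetchall()
with open("readings.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["ts", "pressure"])
    writer.writerows(rows)

print(rows)
```

The resulting readings.csv can then go straight into Excel, R or a charting service.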