Getting started with Julia is pretty straightforward, especially when you are familiar with Python. For this walk-through we will be using data on Covid-19 as provided by the Center for Systems Science and Engineering at Johns Hopkins University in their GitHub repository.
For our data analysis we will use just a few packages to keep things simple: CSV, DataFrames, Dates and Plots. Simply type the using statement followed by the name of the package and you are ready to go.
In case packages are not yet added to your project environment, you can add them easily.
Sometimes you discover powerful Python packages that boost your data analytics or data science workflow, and you wish you had known about them earlier. In this story I share three of my favorites.
While the pandas df.describe() method is great for exploratory data analysis, you would most likely prefer a deeper understanding of your data. Here the pandas profiling package comes in handy! You can install it using the pip package manager by running:
pip install pandas-profiling[notebook]
For our example we will use the Iris dataset and start with describing the data using the pandas
When using pandas in your data science or data analytics projects, you sometimes discover powerful new functions you wish you had known about earlier. Here is my personal top 5.
Pandas has a powerful method read_html() for scraping data tables from webpages.
Let’s assume we need data on gross national income. It is available in a data table on Wikipedia.
With a little help from an easy-to-use Python script…
The topics your Medium story was curated into are easy to find when you know where to look. Zulie Rane wrote an excellent story¹ explaining the steps to find these topics manually. In this story I describe how to automate these steps.
We will use the programming language Python to make a little script to automagically return the topics your story was curated into. So make sure you have Python installed on your laptop. If it isn’t preinstalled you can download and install it from python.org.
The script to automate…
Pluto is a lightweight and easy to use reactive notebook for the Julia language. In this story I will share my experience with Pluto, especially the five features I love most.
Getting started with Pluto is easy. You just add the Pluto package to your project environment and you are good to go. Start Pluto by typing Pluto.run() from the Julia REPL and Pluto opens in your default web browser.
In my workflow I usually navigate to my project directory and start Pluto with the command:
julia --threads auto --project=. -e "using Pluto; Pluto.run()"
There are many things to love…
For this walk-through we will use shapefiles and data published as open data by Statistics Netherlands and the National Institute for Public Health and the Environment. We are plotting a thematic map with daily Covid-19 cases using the
First we will have to load a few packages.
When needed, add missing packages to your Julia project environment. For step-by-step instructions, see my previous story.
To keep things organized, we will create some directories in our project environment.
For mapping actual values to colors, we need to normalize those values. …
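As a rough illustration of the normalization step mentioned above, here is a minimal sketch in Python (the story itself works in Julia). It assumes simple min-max scaling to the range 0 to 1, which may not be the method the story uses. The function name normalize and the sample case counts are hypothetical.

```python
def normalize(values):
    """Min-max scale a list of numbers to the range [0, 1].

    Assumption: min-max scaling is used here for illustration only;
    the original story may normalize differently.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values are identical, so map everything to 0.0
        # to avoid dividing by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical daily case counts, not real data.
cases = [0, 25, 50, 100]
print(normalize(cases))  # [0.0, 0.25, 0.5, 1.0]
```

Each scaled value can then be passed to a color gradient that expects an input between 0 and 1.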
When you start to code multiple projects in Julia, it is recommended to use project-specific environments to keep results reproducible and package dependencies to a minimum. Julia has a great built-in package manager to make things easy. In this story I share my workflow step-by-step.
Pkg is Julia’s built-in package manager and handles operations such as adding, updating and removing packages. Pkg has its own read-evaluate-print loop (REPL). In my workflow, when I want to create a new project environment, I usually start Julia from the directory where I keep my coding projects. …
Senior Information Manager with a passion for all things data. Official author of Towards Data Science (TDS).