Advanced proficiency in Python programming, especially numerical/analytical computing with NumPy, SciPy, and Pandas.
Conceptual understanding of relational databases and hands-on comfort working in a Linux computing environment.
Experience writing SQL-based ETL procedures.
Hands-on experience with the Hadoop ecosystem (HDFS, YARN/MapReduce, Hive, HBase).
Use of command-line tools such as grep, sed, and awk for automating common tasks.
Good problem-solving skills, willingness and ability to learn new technologies quickly, and the confidence to hack your way out of tight corners.
Working knowledge of Python web application frameworks such as Django and Flask.
Understanding of basic statistical and analytical techniques, especially hypothesis testing.
Operating knowledge of the AWS cloud computing platform.
Experience working with version control software, preferably Git.