I am Surya, a young, energetic professional with nearly 3 years of experience in data and platform engineering. Having worked remotely and across time zones for a significant part of this year, I'm eager to take on remote work opportunities where I can apply my skills to solve problems.
Education
Computer Science and Engineering
Vellore Institute of Technology, Bachelor's Degree
2018
Work & Experience
Research Scientist
Zendrive
Jul'2019 - May'2021
- Developed low-level Scala APIs and high-level Python APIs to enable large-scale GIS queries with GeoSpark,
using quad-tree partitioning. Achieved a more than 5x speedup over PostGIS queries on a PostgreSQL database. This
helped scale the enrichment of geospatial features, such as reverse geocoding and segmenting a trip based on
zones, across millions of trips. Geo data is a primary feature in insurance scoring.
- Leveraged the geo-platform above to refactor and scale an existing GeoPandas pipeline that predicts stop signs
on roads in the USA based on GPS trails near road intersections.
- Scraped geographic information, such as boundaries and roads for the entire world, from OpenStreetMap, converted
it into Scala-compatible formats, and designed a hierarchical storage layout to enable the geo-platform APIs.
- Automated large dataset generation and validation tasks, processing hundreds of millions of rows, on Airflow.
- Migrated in-house libraries to be Python 3 compatible to enable the use of newer machine-learning frameworks.
MEAN stack Development Intern
SIBIA Analytics
Dec'2016 - Jan'2017
Integrated a front-end dashboard with a NoSQL database using JavaScript to display a variety of analytical metrics
in near real-time for a popular news provider in West Bengal.
SELECT PROJECTS
Visualizing the schema of complex Python objects for easy analysis of their data structures. Published on
PyPI (package name: print-schema) with 1.5K downloads (as of Mar '20).
Created a reusable pipeline that fetches top posts from Reddit, creates images from the text, then uploads them to an Instagram page in a completely automated way through APIs. Blog post featured on Better Programming.
Quantifying biases (or the lack thereof) of news sources on Twitter by fetching tweets, filtering them, and
performing sentiment analysis, implemented with Python and VADER.