Data Engineer
Engineering
hybrid: Mountain View
added Wed Sep 27, 2023
The global data management software market is set to reach $137.6 billion by 2026, and we're on a mission to make a significant impact. We're seeking intellectually curious and highly motivated Data Engineers to become foundational members of our Machine Learning and Data Platform team.
Required Qualifications
- 4+ years of professional experience in SaaS/Enterprise companies
- Strong experience with data ingestion and connectors
- Experience in building end-to-end production-grade data solutions on AWS or GCP
- Experience in building scalable ETL pipelines.
- Ability to plan effective data storage, security, sharing, and publishing within an organization.
- Experience in developing batch ingestion and data transformation routines using ETL tools.
- Familiarity with AWS services such as S3, Kinesis, EMR, Lambda, Athena, Glue, IAM, RDS.
- Proficiency in several programming languages (Python, Scala, Java).
- Familiarity with orchestration tools such as Temporal, Airflow, Luigi, etc.
- Self-starter, motivated, with the ability to structure complex problems and develop solutions.
- Excellent communication skills and the ability to explain the strengths and weaknesses of data and analytics to both technical and senior business stakeholders.
Preferred Qualifications (good to have)
- Deep familiarity with Spark and/or Hive
- Understanding of storage formats such as Parquet, Avro, Arrow, and JSON, and when to use each
- Understanding of schema design trade-offs such as normalization vs. denormalization.
- Proficiency in Kubernetes and Terraform.
- Experience with Azure, Azure Data Factory (ADF), and/or Databricks
- Experience with integrating, transforming, and consolidating data from various data systems into analytics solutions
- Good understanding of databases, SQL, ETL tools/techniques, data profiling and modeling
- Strong communication skills and client engagement experience
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
The data warehouse emerged to solve the problem of running analytics over large amounts of data. Since then, data storage has grown from megabytes to gigabytes to terabytes with no end in sight. Companies invest millions of dollars to store and organize that data, yet leverage only a fraction of it for machine learning.
With Kumo, we are building the first data platform that seamlessly enables machine learning over data warehouses, delivering faster, simpler, and smarter predictions that combat data waste and maximize data value. Query the future with Kumo.