Location: Durham, NC
Duration: 6 Months
- Looking for a Principal Data Engineer who will work closely with data scientists and other stakeholders to design and maintain highly complex data models.
- The Principal Data Engineer is responsible for developing and supporting advanced and complex reports that provide accurate and timely data for internal and external clients.
- The Principal Data Engineer will lead the design and growth of a data infrastructure that powers our ability to make timely, actionable and data-driven decisions.
- Serve as the principal lead to identify, evaluate and execute the development and implementation of data infrastructure:
- Provide strategic thought leadership and partner with other functional areas to deliver collaborative work products that align with divisional and enterprise strategy
- Develop highly complex SQL queries to extract data for analysis and model construction
- Consult with business and technical staff to understand business needs and provide thought leadership on solutions during requirements-gathering sessions
- Support critical strategic initiatives and architect highly complex data engineering projects
- Own delivery of multiple large, complex data engineering projects simultaneously
- Design and develop scalable, efficient data pipeline processes to handle data ingestion, cleansing, transformation, integration, and validation required to provide access to prepared data sets to analysts and data scientists
- Ensure performance and reliability of data processes
- Document and test data processes, including performing thorough data validation and verification
- Collaborate with cross-functional teams to resolve data quality and operational issues and ensure timely delivery of products
- Develop and implement scripts for database and data process maintenance, monitoring, and performance tuning
- Analyze and evaluate databases in order to identify and recommend improvements and optimization
- Design advanced, eye-catching visualizations to convey information to users
- Serve as a mentor and coach to more junior team members on skills and approaches, and oversee the work of other staff members as needed
- Bachelor’s degree in Computer Science or other STEM-related major
- 7 years of experience in the following:
- Experience with Oracle, Data Warehouses and Data Lakes
- Experience with Big Data platforms such as Hadoop, Spark, HBase, CouchDB, Hive, Pig, etc.
- Programming experience in Python, R or other related language
- If no degree, 9 years of the experience listed above
- If Master’s degree in STEM or related major, 5 years of the experience listed above
- Advanced SQL programming skills
- Expert-level experience working with large and complex data sets
- Advanced experience leading large, complex business analysis projects and/or technical projects
- Advanced experience with reporting and business intelligence tools such as Tableau, Informatica, Teradata, MySQL, QlikView, Oracle, etc.
- Master’s Degree in STEM or related major
- Experience with Hadoop, Hive and/or other Big Data technologies
- Experience with ETL or Data Pipeline tools
- Experience with query and process optimization
- Experience working in AWS and/or using Linux-based systems
- Prior project management experience with Agile methodology
- Ability to translate task/business requirements into written technical requirements
- Reliable task estimation skills
- Excellent quantitative, problem-solving, and analytical skills
- Ability to document data pipeline architecture and design
- Ability to collaborate effectively with business stakeholders, performance consultants, data scientists, and other data engineers
- Proficiency in MS Office applications, including expert-level Excel programming
- Ability to quickly become an expert in operational processes and data of lines of business
- Ability to troubleshoot and document findings and recommendations
- Ability to communicate risks, problems, and updates to leadership
- Ability to keep up with a rapidly evolving technology space
- Past experience with ETL tools and current experience with any part of the Cloudera stack