Location: Remote
Work Schedule: Full Time
Salary: $94,016.00/Annum
Job Duties:
- Create Snowflake tables clustered on the date column to optimize the performance of ad-hoc queries (see the second sketch after this list).
- Write Snowflake SQL scripts for consumer data analysis to meet business requirements.
- Export data from Azure Data Lake Storage (ADLS) to Snowflake using Databricks notebooks, and create Snowflake MERGE scripts to merge data between the stage and target tables (see the first and second sketches after this list).
- Create Snowflake tables from a wide range of data formats, including CSV, JSON, Parquet, and Delta.
- Create non-materialized and materialized views in Snowflake based on the frequency of data change and query complexity.
- Transform semi-structured log data to fit the schema of the Snowflake tables using PySpark (see the third sketch after this list).
- Schedule Databricks PySpark jobs from the Stars scheduler.
- Perform ETL data cleansing, integration, and transformation using PySpark.
- Export the analyzed data to the curation databases using PySpark and Snowflake scripts for visualization and to generate reports for the business team.
- Design a data warehouse in Snowflake using tables and materialized views in each data product.
- Create and maintain technical documentation for creating and scheduling Databricks jobs and creating job clusters.
- Use PySpark to analyze structured data from multiple sources.
- Create Databricks jobs to generate audit reports on data uploaded to ADLS.
- Resolve data issues by identifying root causes and implementing data fixes.
- Monitor system performance by performing regular tests, troubleshooting issues, and integrating new features.
- Assess database implementation procedures to ensure they comply with internal and external regulations.
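
The sketches below are illustrative only and are not part of the listed duties; every object name, path, and credential in them is a hypothetical placeholder. This first sketch shows the kind of Databricks PySpark export described above, assuming the Snowflake Spark connector is installed on the cluster:

```python
# Illustrative sketch only: all paths, table names, and credentials are hypothetical.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # provided automatically in a Databricks notebook

# Hypothetical Snowflake connection options for the Spark connector.
sf_options = {
    "sfURL": "myaccount.snowflakecomputing.com",
    "sfUser": "etl_user",
    "sfPassword": "...",          # in practice, read from a Databricks secret scope
    "sfDatabase": "ANALYTICS",
    "sfSchema": "STAGE",
    "sfWarehouse": "ETL_WH",
}

# Read raw consumer data from ADLS (Delta here; CSV/JSON/Parquet read the same way).
df = spark.read.format("delta").load(
    "abfss://raw@mystorageaccount.dfs.core.windows.net/consumer_events/"
)

# Land the data in a Snowflake stage table; a separate MERGE promotes it to the target.
(df.write
   .format("snowflake")           # Databricks short name for the Snowflake Spark connector
   .options(**sf_options)
   .option("dbtable", "STG_CONSUMER_EVENTS")
   .mode("overwrite")
   .save())
```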
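This second sketch covers the clustered target table, the stage-to-target MERGE, and a materialized view, run here through the snowflake-connector-python package; all object names are assumptions:

```python
# Illustrative sketch only: object names and credentials are hypothetical.
import snowflake.connector

DDL = """
CREATE TABLE IF NOT EXISTS CONSUMER_EVENTS (
    EVENT_ID   NUMBER,
    EVENT_DATE DATE,
    PAYLOAD    VARIANT
)
CLUSTER BY (EVENT_DATE)  -- clustering on the date column speeds up ad-hoc date-range queries
"""

MERGE = """
MERGE INTO CONSUMER_EVENTS AS tgt
USING STAGE.STG_CONSUMER_EVENTS AS src
   ON tgt.EVENT_ID = src.EVENT_ID
WHEN MATCHED THEN UPDATE SET
    tgt.EVENT_DATE = src.EVENT_DATE,
    tgt.PAYLOAD    = src.PAYLOAD
WHEN NOT MATCHED THEN INSERT (EVENT_ID, EVENT_DATE, PAYLOAD)
    VALUES (src.EVENT_ID, src.EVENT_DATE, src.PAYLOAD)
"""

-- = None  # (unused placeholder removed)
# A materialized view suits frequently queried, slowly changing aggregates
# (note: materialized views require Snowflake Enterprise Edition);
# a regular view suits fast-changing data or cheap queries.
MVIEW = """
CREATE MATERIALIZED VIEW IF NOT EXISTS DAILY_EVENT_COUNTS AS
SELECT EVENT_DATE, COUNT(*) AS EVENT_COUNT
FROM CONSUMER_EVENTS
GROUP BY EVENT_DATE
"""

conn = snowflake.connector.connect(
    account="myaccount", user="etl_user", password="...",
    warehouse="ETL_WH", database="ANALYTICS", schema="PUBLIC",
)
cur = conn.cursor()
try:
    for stmt in (DDL, MERGE, MVIEW):
        cur.execute(stmt)
finally:
    cur.close()
    conn.close()
```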
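This third sketch shows one way the semi-structured log transformation and basic cleansing could look in PySpark; the log schema and column names are assumptions:

```python
# Illustrative sketch only: paths, column names, and the log schema are hypothetical.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F
from pyspark.sql.types import StructType, StructField, StringType, TimestampType

spark = SparkSession.builder.getOrCreate()

# Assumed shape of each JSON log record.
log_schema = StructType([
    StructField("event_id", StringType()),
    StructField("ts", TimestampType()),
    StructField("user", StructType([
        StructField("id", StringType()),
        StructField("region", StringType()),
    ])),
])

raw = spark.read.text("abfss://logs@mystorageaccount.dfs.core.windows.net/app/")

# Parse the JSON lines, flatten the nested struct, and cleanse obvious bad rows
# so the result matches the columns of the Snowflake target table.
events = (
    raw.select(F.from_json(F.col("value"), log_schema).alias("log"))
       .select(
           F.col("log.event_id").alias("EVENT_ID"),
           F.to_date("log.ts").alias("EVENT_DATE"),
           F.col("log.user.id").alias("USER_ID"),
           F.upper(F.trim(F.col("log.user.region"))).alias("REGION"),
       )
       .dropna(subset=["EVENT_ID", "EVENT_DATE"])   # basic cleansing
       .dropDuplicates(["EVENT_ID"])
)
```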
Minimum Education Requirement: This position requires a minimum of a bachelor’s degree in computer science, computer information systems, or information technology, or a combination of education and experience equating to the U.S. equivalent of a bachelor’s degree in one of the aforementioned subjects.