Data Analyst Engineer

Location:
Houston, TX
Posted:
February 06, 2024


Resume:

Name: Sri Harsha A

Email: ad3e3p@r.postjobfree.com

Phone: 352-***-****

Professional Summary:

●Data-literate, insights-oriented developer with over 4 years of experience as a Data Engineer, Data Analyst, and BI Developer, which has enabled me to craft innovative solutions for stakeholders in near real time.

●Actively participated in decision making and QA meetings, and regularly interacted with Business Analysts and SMEs to understand business processes, requirements, and designs.

●Proficient in handling and ingesting terabytes of streaming data (Kafka, Spark Streaming, Storm) and batch data, and in automation.

●Collaborate regularly with cross-functional teams, such as business and engineering teams, to dive deep into data, make effective decisions, and support analytics platforms.

TECHNICAL SKILLS:

Big Data Technologies: HDFS, YARN, MapReduce, Apache Spark, Scala, Impala, Kafka, Apache Hadoop, Cloudera CDP.

Tools: Salesforce, Databricks, Dataflow, Power BI, Looker, QlikView, Tableau, Hive, Spark, HBase, Kafka, PySpark, Spark SQL, Sqoop, ADF, StreamSets, Airflow, Docker, Kubernetes, Prometheus, Grafana, Stream Analytics, Azure DevOps, GitHub, Azure, GCP, AWS, Pandas, PyTorch, IntelliJ, Eclipse, Git, Maven, Android Studio, Logic Apps, Power Automate.

Programming Languages: C, SQL, HiveQL, Scala, Python, Java, Unix Shell Scripting.

Databases: MS SQL Server, Oracle, MS Access, MySQL, Teradata, PostgreSQL, DB2, Cassandra, MongoDB.

Reporting Tools/ETL Tools: SSIS, SSRS, SSAS, Tableau, Power BI.

Professional Experience:

Dish Network BI Developer Jan 2023 – Present

●Implemented an ETL process to regulate the energy bidding process by generating and publishing locational marginal pricing (LMP) files, which are used to establish bid prices based on location and congestion.

●Developed SSRS reports that consume historical load data to generate precise load curves used for energy forecasting by government and non-government organizations.

●Developed an ETL process and SSRS dashboard for weather forecasting using a weather data API; the dashboard data is used by weather forecasting agencies.

●Implemented an ETL process to transform a single data source into different sets of outputs for Markets and Planning, which significantly reduced project cost.

●Documented data extraction workflows and best practices, reducing onboarding time for new team members by 20%.

●Improved ETL (Extract, Transform, Load) processes, reducing data extraction time by 15% and enhancing data quality.

●Leveraged Python and data mining techniques to identify hidden patterns, resulting in a 15% increase in product efficiency.

●Conducted pricing analysis that resulted in a 5% increase in profit margins by optimizing pricing strategies based on market dynamics.

●Implemented a matrix report to reduce the number of pages from 300 to 80, which improved performance and increased the number of users accessing the report.

●Performed data analysis and handled ad-hoc requests by interacting with business analysts, clients, and customers, and resolved issues as part of production support.

●Designed and maintained interactive dashboards that tracked key payment KPIs, resulting in a 30% increase in the accuracy and speed of decision-making processes.

●Created monthly performance reports with visualizations that were shared with executive stakeholders, leading to a 15% improvement in executive satisfaction with reporting.

●Moved bulk data loads from hundreds of thousands of MySQL databases to a streaming process, so that business processes that previously used data hours to days out of date now use data updated every 5 minutes.

●Developed complex data cleaning calculations, such as DST adjustments, time zone conversions, and day exclusions, using SQL.

●Built ETL pipelines using SSIS that integrate data from 100+ files and load it into the Enterprise Data Warehouse (EDW).

●Managed a 500 GB data warehouse for the Information Systems department, which included financial, project tracking, talent acquisition, and HR data.

●Worked on migrating data from AWS Redshift to a properly partitioned dataset on AWS S3. This enabled the use of AWS Redshift Spectrum, reducing warehouse cost by about 80%.

●Automated business metrics using Python scripts, Hive SQL, and the Oozie scheduler to generate Tableau dashboards.

●Built visually compelling, comprehensive trend visualizations and 50+ domain KPIs for 363 products across 120 business operation centers.

●Conducted both repeatable and ad-hoc analyses to track and interpret trends in user behavior and product usage, influencing feature prioritization and design decisions.

●Utilized statistical modeling to conduct A/B tests, resulting in a 10% increase in user engagement.

Infosys, India Data Engineer June 2018 – July 2021

●Designed and developed a Python parser to auto-convert HiveQL code into equivalent PySpark (Spark SQL) jobs to leverage Spark capabilities on AWS EMR, reducing conversion time by over 90%.

●Utilized the Kafka pub-sub model for tracking real-time events in the data records to trigger data orchestration processes.

●Implemented automated data pipelines to ensure the reliability and accuracy of metrics, reducing manual data handling by 20%.

●Designed and built a data pipeline to consolidate similar products without using unique IDs, using Word2Vec, Spark, Snowflake, and Airflow. This enabled bidding on additional products per auction, leading to an increase in spending of 9%.

●Automated advanced SQL queries and ETL tasks using Apache Airflow, reducing tedious weekly administration tasks by over 50%.

●Worked on creating and modifying SQL stored procedures, functions, views, indexes, and triggers.

●Applied data mining to a shipping consolidation problem, which resulted in $2.5 million savings over 2022 for single-day shipping consolidation.

●Implemented several DAX functions for various fact calculations for efficient data visualization in Power BI and optimized the DAX queries.

●Generated monthly analysis reports and data extracts with Python, Power BI, SQL, and MS Excel.

Education Details:

Master's in Data Science from the University of Houston (graduated in December 2022)

Academic Projects:

Facial reconstruction of masked images:

●We implemented a method that combines deep learning and Local Binary Pattern (LBP) features to recognize masked faces. We applied pre-trained deep convolutional neural networks (CNNs) to extract the best features from the obtained regions (mostly the eye and forehead regions). We achieved a training accuracy of 95.8 percent and a test accuracy of 94.5 percent.

Phishing Detection using Web URL (Summer 2021):

●Built a system to detect fake or phishing websites that try to gain access to sensitive data or to users' personal credentials. Acquired the dataset through web scraping, since spammers keep changing URLs and the dataset is dynamic. The system is implemented using the Random Forest algorithm and offers both high detection power and low error rates.


