email@example.com • linkedin.com/in/shiv-racharla-624087a4 • github.com/Shiv-Racharla
Big Data Architect
Strong track record of success creating big data solutions for key business initiatives
in alignment with analytics architecture and future state vision.
Seasoned information technology professional skilled in business analysis, business intelligence, data modeling, data architecture, and data warehousing. Proven ability to deliver on organization mission, vision, and priorities. Extensive hands-on experience leading multiple data architecture projects, gathering business requirements, analyzing source systems, and designing data strategies for dimensional, analytical, transactional, and operational data systems.
Machine Learning/Deep Learning
Leadership/Team Training and Support
Xylo Corporation, Neptune, NJ
Architect/Lead, 10/2016 – Present
Clients: Bank of America, CUNA Mutual, Anthem, Inc, Microsoft, Fannie Mae
Perform complex upserts with Kudu, MongoDB, Hudi, and Delta Lake on large data volumes derived from various sources; process large-scale electronic medical and financial records for daily and monthly loads stored in Amazon S3, Redshift, HDFS, and Blob Storage. Develop predictive models using machine learning algorithms such as linear regression, logistic regression, and decision trees. Design and develop extract, transform, load (ETL) applications using big data technologies, automated via Oozie, Control-M, Autosys, and shell scripts. Use Jenkins, Bamboo, Azure, and AWS for continuous integration, building project code before deployment. Set up and run data ingestion with StreamSets and NiFi for various data formats and sources. Mentor and supervise on-site employees and outsourced/off-site personnel; write and update guidelines and protocols for teams to complete objectives.
Solely built Lambda and Kappa architectures and solutions for on-premises (Cloudera, Hortonworks), hybrid (Cazena), and cloud (AWS, Azure) environments; also designed an API that connects to MongoDB as the source for both on-premises and cloud deployments.
Transformed unstructured data into structured data with Apache Spark, using DataFrames and querying from other data sources into S3, Redshift, Hive, Impala, Kudu, Hudi, and MongoDB.
Built an Ethereum blockchain and deployed smart contracts on a private network; also built an application with Hyperledger Fabric using Hyperledger Composer on IBM Bluemix.
Conceptualized and created models using machine-learning regression techniques.
Built build-and-release pipelines to automate the CI/CD deployment process, orchestrated with Docker and Kubernetes.
MatlenSilver, Inc, Charlotte, NC
Hadoop Developer, 7/2015 – 10/2015
Client: Bank of America
Led capacity planning of Hadoop clusters based on application requirements. Managed several Hadoop clusters and other Hadoop ecosystem services in development and production environments. Contributed to the evolving architecture of company services to meet changing requirements for scaling, reliability, performance, manageability, and pricing. Developed, designed, and automated ETL applications using Oozie workflows and shell scripts.
Created Sentry policy files for business users in development, user acceptance testing, and production environments to provide access to required databases and tables in Impala; also designed and incorporated security processes, policies, and guidelines for accessing the cluster.
Converted copybook files from EBCDIC to ASCII and binary formats; stored files in HDFS and created Hive tables as part of decommissioning mainframes, making Hadoop the primary source for exports back to mainframes.
Elite IT Solutions, Inc., Springfield, IL
Software Engineer, 5/2013 – 6/2015
Clients: Cerner Corp, Premier Inc., First Data Corporation
Pulled data from relational database management systems (RDBMS) such as Teradata, Netezza, Oracle, and MySQL using Sqoop; stored data in the Hadoop Distributed File System (HDFS). Developed and deployed an internal shell-script tool for comparing RDBMS and Hadoop data to verify that all source and target data matched. Created external Hive tables to store and run queries on loaded data.
Architected, implemented, and tested data analytics pipelines with Hortonworks/Cloudera.
Implemented partitioning and bucketing techniques for external tables in Hive, improving space and performance efficiency.
Prior experience as Java Developer (7/2006 to 12/2010) and Network Administrator Associate (6/2002 to 5/2006) for Ashvamedh Services, India.
Big Data Technologies:
HDFS, MapReduce, YARN, Hive, Pig, Sqoop, Impala, Oozie, Flume, ZooKeeper, Kafka, NiFi, HBase, MongoDB, StreamSets, Talend, Splunk, Kibana, Logstash, Elasticsearch, Kudu
Spark, Spark SQL, Spark Streaming, and Spark MLlib
AWS, S3, EBS, EC2, VPC, Redshift, EMR, Azure, CloudFront, Glue, Athena, Hudi, API Gateway
Machine Learning, Deep Learning, TensorFlow, scikit-learn, SageMaker, Keras, PyTorch
Ethereum, Cardano, R3, Hyperledger, Smart Contracts
Java, Python, Scala, R, Solidity
Unix shell scripting, SQL, and PL/SQL
Jenkins, Bamboo, Azure, AWS, Docker, Kubernetes, Helm, Astronomer
Oracle, MySQL, SQL Server, Netezza, Teradata
Maven, Eclipse, PyCharm, RStudio, Jupyter, Zeppelin, Tableau, GitHub, Bitbucket, Jira, TFS, VSTS, Autosys, Control-M, Airflow
Education and Credentials
Master of Science in Technology, Concentration in Computer Technology (2012)
Eastern Illinois University, Charleston, IL
Apache Spark Developer (Databricks)
AWS Solutions Architect Associate (Amazon)
HDP Certified Developer (Hortonworks)
Machine Learning (Udacity)
Blockchain Essentials (IBM)
Python for Data Science (IBM)
Docker Essentials (IBM)
Deep Learning (IBM)