BUNYAMIN DURSUN
Data Engineer & Data Architect & Data Strategist
+44-772*-***-***, **************.**@*****.***, London
Profile Summary: 22 years of experience in creating data strategies and roadmaps, data project management, data architecture, data analytics, data engineering, data quality analysis, database design, data visualisation, web scraping, web crawling, data integration, data system testing, text analytics, fuzzy string matching, NLP, synthetic data generation, affiliate marketing platforms, processing unstructured and structured data, advanced SQL and Python, and Robotic Process Automation. Strong research and R&D background, with experience in innovation and in developing Proof of Concept (POC) projects. Participation in Architectural Review Boards. Team-level and department-level management experience.
TECHNICAL SKILLS
●Data Architecture: Power Designer, ARIS, Erwin, Embarcadero, Visio, TOGAF, Lucidchart
●Programming Languages: Python, Java, C#.Net, SQL, PowerShell, Shell script, JS, Google Apps Script
●Python Libraries: Pandas, NumPy, NLTK, Matplotlib, Scikit-learn, TensorFlow, Scrapy, BeautifulSoup, Gensim, Word2Vec, OpenNLP, Seaborn, Selenium, spaCy, Flask, GeoPandas, Prodigy, JS, Dash, Plotly, SQLAlchemy, AWS Lambda, Alembic, PyArmor, Logging, Playwright, GraphFrames, NetworkX
●Databases: MS SQL Server, MySQL, PostgreSQL, Oracle, MongoDB, Firebase, Hadoop, Hive, SPARQL, OWL, Neo4J, NoSQL, Graph Databases, RDF Databases, BigQuery, SQLite, IBM DB2, Snowflake
●Data Analysis Concepts: Data analysis, Data mining, Machine learning, Data quality, Fuzzy record matching, NLP, NLG, Automated insight generation, Data model validation, Fraud detection, Customer segmentation, Data architecture, Data modelling, ETL, Data warehouse, Semantic web, Ontologies, Data visualisation, Metadata management, Data migration, Data Anonymization, Chatbots, Data Automation, Automated Fuzzy Schema and Data Catalogue Matching
●Data Engineering and ETL: Databricks, AWS Glue, PySpark, Airflow, AWS, GCP, Docker, Delta Live Tables, Delta Tables, VMware, IBM Cloud, IBM Watson Discovery, IBM Global Names Management
●Data Analysis Tools: Tableau, Azure Data Studio, Azure Data Factory, Looker Data Studio, Excel, MS Power BI, MS SSIS, Oracle Data Miner, Informatica Data Quality, Weka, Rapid Miner, Trillium, Lucene, MS Reporting Services, Alteryx, Databricks SQL Analytics, Elasticsearch ELK, IBM Cognos Analytics
●PM Tools: Atlassian Jira, Confluence, Trello, Asana, Azure DevOps, Agile, Bitbucket, GitLab, GitHub, CI/CD
●API Development and Message Queue: Flask API, GraphQL, Swagger, Postman, Selenium, JMeter, Creating Data Access Layers, Kafka, RabbitMQ
●Robotic Process Automation (RPA): Blue Prism RPA, Power Automate, IBM RPA
●Information Security: ISO 27001, GDPR, SANS GIAC Security Essentials (GSEC)
PROFESSIONAL EXPERIENCE
May 2024 – August Lead Data Engineer BT (British Telecom)
●Leading the Data Engineering team, refactoring and modernizing the legacy data pipelines.
●Modernizing the Data Quality and Availability Framework for marketing data. Creating data lineage automations. Improvements and enhancements to the NBA (Next Best Action) Framework.
●Reducing the need for ad-hoc report development. Improving the quality of the base insights through centralization, visualisation and data. Improving latency and reliability for core trading reports.
●GCP, BigQuery, Collibra, Terraform, SQL, Python, Google Composer, Airflow, Jira, Confluence, GitLab, Adobe Analytics, Google Analytics, Paid media Audience data intelligence, GCP Dataplex, CI/CD
Sep 2021 – Apr 2024 Senior Data Engineer / Data Architect Next
●Data architecture and data engineering work on external partner clients’ data in the Next retail platform. Onboarding retail platform clients into a modern AWS Databricks environment.
●Creating common data models and building data pipelines and data transformations. Data quality analysis, data validation and data lineage. Data catalog discovery and data catalog management.
●Working across teams and on multi-company projects. Creating Proof of Concept projects to test new approaches and technologies.
●Creating automations for paid media ads platforms and audience management in Databricks (Google Ads API, Facebook Audience API, Pinterest Audience API, Google Analytics, GA360)
●Creating third party data feed integrations for ads and affiliate marketing platforms in Databricks (Conversant, Partnerize, Emarsys, Exponea, Yocuda, iGuide, Splash, Monetate)
●Creating network and graph representations and analytics of user behavioural data on the website. Real-time customer content personalisation, intelligent recommendations, improving search relevancy and recall.
●Creating product search engines (free text searches and navigated searches), product selections, product recommendations, product rankings and personalisation on the Next website to increase the Hit Rate, Add to Basket and conversion rates of visits.
●Developing large scale automated web crawlers for international market competitor price analytics.
●Databricks, Python, Spark, SQL, MySQL, REST API, Web services, SFTP, GCP, BigQuery, AWS, S3, Google Data Studio, ML, NLTK, Word2Vec, Gensim, FuzzyWuzzy, NetworkX, GraphFrames, MS SQL Server, Google Discovery Solutions for Retail, Elasticsearch, Databricks Accelerators, MLOps, Model Serving Endpoint, Algolia, semantic search engine, web scraping, data architecture, data modelling, Bloomreach, Azure Data Lake, Azure DevOps, Word Embeddings, large language models, Selenium, BeautifulSoup, Neo4J.
Oct 2020 – Aug 2021 Data Architect / Data Engineer PerkBox
●Migrating legacy data pipelines into a modern, cloud-based Databricks environment. Building a data platform that enables business teams to provide real-time information to key stakeholders. Data democratisation, automated pipeline validation and testing, data quality analysis, customer segmentation. Part of the data architecture team reviewing the legacy data platforms and modernising them with emerging technologies.
●AWS, Redshift, DynamoDB, MySQL, PostgreSQL, Databricks, AWS Glue, Airflow, S3, GitLab, CI/CD, API, AWS CLI, REST & GraphQL, RDS, Aurora DB, Salesforce, Tableau, Python, SQL, PySpark, Data Quality, SQL Analytics, NetSuite, Databricks ML, Jira, Confluence
Dec 2019 - Mar 2020 Data Insights Manager Upside Hedge Fund
●Managing the data platform for Automated Insight Generation with NLG.
●Automated data extraction from analyst forecast PDF reports.
●Validating indexes for the forecasting performance of equity stock market analysts and hedge fund portfolio managers. Designing new metrics and creating new indexes for buy/sell decision makers. Creating analyst-wise, team-wise and firm-wise performance benchmarks. (Python, NLP, NLG, Tableau, PDFMiner)
Aug 2019 - Dec 2019 Data Consultant / Data Engineer Mental Health Innovations
●Worked as a data specialist automating the full cycle of the data anonymization and data scrubbing process on AWS Cloud. Creating data analytics and reports. (Python, NLP, AWS, Redshift, S3)
●Leading a collaborative research data project between Mental Health Innovations and Imperial College on data anonymization, data scrubbing, and measuring data anonymization quality on structured and unstructured data sets.
●Building a data governance model and enabling privacy-preserving machine learning on a very sensitive data set that contains private personal information.
●Data privacy and security, GDPR, Data governance, Python, NLP, spaCy, NLTK, Scrubadub, TF-IDF
Feb 2019 - Aug 2019 Data Project Manager FCA (Financial Conduct Authority)
●Project Manager of the Sanctions Screening Synthetic Data Generation Project.
●Building the Sanctions Screening Toolkit Proof of Concept Project in the Advanced Analytics and RegTech Team.
●Financial crime network data analysis and modelling, fuzzy string matching and synthetic data generation for sanctions screening. Using synthetic data generation to address GDPR and data privacy concerns.
●Object-oriented Python, REST APIs, NLTK, Matplotlib, Seaborn, NumPy, AWS, AWS CodeCommit, AWS Cloud9, Tableau, PostgreSQL, Sphinx, PyCharm, Gensim, Requests
Oct 2016 - Jan 2019 Data Test Lead LexisNexis
●Preparing Gold Standard test datasets to test the financial screening / sanctions screening software. Creating a platform for measuring test data coverage and test data quality assurance. Leading data-oriented acceptance tests and test validations, data dictionaries and data catalogs.
●Synthetic test data generation for testing AML and sanctions screening software and payment systems.
●Establishing a test data quality measurement framework and measuring the performance of Watch List Screening Systems and the accuracy of AML tools. Executing data tests and data quality, hygiene and completeness checks on test data sets. Data-driven proof of the traceability and predictability of matching engine software outputs. Text analytics on multi-language or multi-script texts.
●Data governance, Web services, SoapUI, XML, JSON, Python, Statistical analysis, MongoDB, Postman, Java, Jira, PayPal transaction data format
Sep 2016 – Dec 2021 Data Consultant InfoMerge
●Creating data strategies and roadmaps, participating in Architectural Review Boards in different organisations.
●Research on emerging technologies, running benchmarks for alternative solutions, reviewing the as-is state and guiding to-be architectures and infrastructures. Consultancy on data migration projects.
●Social media automation platform for e-commerce sellers. Automated content generation for affiliate marketing and for Telegram, Twitter, Slack, Instagram, Pinterest and Facebook channels.
●Data analytics on Algorithmic Trading, Thomson Reuters API, Binance API, sec.gov datasets
●Automated data collection and preparation services for the research projects of academics, academic departments and research groups.
●Large scale web scraping and data analytics projects on academic databases: Google Scholar, Web of Science, Scopus, Econlit, academic journal databases, publication and bibliographic databases, conference hosting systems, academic performance calculations
●Commercial data collection and automation for several companies, including insurance and short-term accommodation companies
●Python, NumPy, NLTK, Selenium, Pandas, Matplotlib, Jupyter, PySpark, SpaCy, Flask, Anaconda, Scipy, Java, JSoup, Crawler4j, JSON, Google Analytics, Google Data Studio, GCP, BigQuery, MySQL, Telegram API, Slack API, Twitter API, Facebook API, Fuzzy record matching, Fuzzy string matching, Blue Prism RPA
Mar 2014 – Aug 2016 Lead Data Architect A-techSyn + Alfa Technologies
●Data architect on an IT transformation project for an energy company.
●Designing spatial databases and algorithms for an R&D project on moving object and trajectory databases, flight data analysis for UAVs, and creating shortest path algorithms
●Python, NumPy, Matplotlib, GIS, Web Services, MS SQL Server, T-SQL, data quality, vehicle tracking systems, BI Reports, SAP HANA, SAP BO Reporting, Java, Hadoop
Feb 2010 – Mar 2014 Lead Data Specialist TUBITAK
●Data analyst on security analytics, computer-aided auditing, financial auditing and risk scoring projects. Creating ETL, data integrations, data warehouse and BI reports. Working on streaming data sources for real-time decision making.
●Data analytics on medical databases and medical articles, PUBMED, WHO Cancer Classifications, MeSH Tree, pathology reports and medical bibliographic databases
●Clustering, influence group detection, behavioural analysis and local event detection on Twitter; sentiment, semantic and spatial analysis of tweets
●ETL, ODI, MS SQL Server, MS SSIS, Oracle, Spatial, Orange, Rapid Miner, Java, C#.Net, Text similarity, NLP, PL/SQL, Data Analytics, Twitter4j
Dec 2008 – Feb 2010 Data Architect Team Lead Turkish Telecom
●Data modelling, data architecture, 360-degree customer view design. MDM, data governance, customer unification, customer segmentation. Fuzzy matching on customer records. Data quality measurements, creating strategic database reports and dashboards. Designing architecture for external data source integrations, solving data inconsistency issues, leading the Data Quality Programme of Turk Telekom and contributing to the Data Quality strategy and roadmap
●Oracle SOA Suite, Sybase Power Designer, ARIS, PL/SQL, Oracle, Web services, SOAP, Rule engines, Text similarity, Database merge, GIS databases, C#.Net
Dec 2002 – Dec 2008 Data Engineer BankSoft / Havelsan
●Designing and implementing archive databases and ETL pipelines, database administration, database performance tuning, data analysis, database programming, writing advanced stored procedures.
●SSIS, DTS, SQL, PL/SQL, T-SQL, Oracle, MS SQL, MySQL, C#.Net, Infragistics, Embarcadero, Erwin, MS Analysis Services, dtSearch, Lucene, Oracle Text, Oracle Data Miner, Data Analytics, NLP, SAP ABAP
EDUCATION AND TRAINING
●Ph.D. YBU Computer Science Department, 2011-2014
●M.Sc. Degree in Computer Engineering, Yildiz Technical University, 2008
●B.Sc. Degree in Computer Engineering, Istanbul Technical University, 2002