Shashwat Khanna
Verified Expert in Engineering
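Data Scientist and Software Developer
Shashwat is a seasoned professional with nearly a decade of experience in core data science. He has rich experience designing, developing, and deploying machine learning models for clients across the banking, financial services, insurance, retail, eCommerce, and healthcare sectors. Shashwat currently handles end-to-end product analytics for a recently launched Shopify product.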
Preferred Environment
Amazon Web Services (AWS), Spark, Windows, Ubuntu, R, Python 3
The most amazing...
...thing I've developed is a large-scale ad keyword revenue prediction model that helped an aggregator achieve a profit uplift of around 20%.
Work Experience
Data Scientist (Freelance)
Freelancer
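- Worked on multiple ML and BI projects across sectors. Created several complex models using ML models, advanced LLMs, NLP embeddings, and more. Handled multiple clients in parallel.
- Created several POCs. Gained experience with both paid and open-source LLMs. Currently exploring the use of LLMs for different use cases.
- Successfully managed and improved models deployed at scale. Served as both a data engineer and a data scientist.
- Designed and implemented a large pipeline to process multiple years of mobile data (about 500 GB per day) and derive inferences using geospatial intelligence methods.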
Senior Data Scientist
Shopify
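- Handled product analytics for a new product and owned data event instrumentation, data models, analytics dashboards for internal users, and user-facing analytics.
- Used a combination of PySpark, DBT/SQL, and data visualization tools. Worked closely with multidisciplinary teams, including product managers, UI/UX experts, developers, and senior leadership, to set the data roadmap.
- Oversaw the product's testing and GTM launch, provided internal stakeholders with key insights on product usage and adoption, and drove the product roadmap and prioritization.
- Defined experiments, KPIs, and guardrail metrics for newly launched Shopify initiatives.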
Senior Data Scientist
Clara Analytics
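- Led a team of data scientists and engineers to develop products focused on NLP using rule-based methods, deep learning, and neural networks, including RNNs, LSTMs, and autoencoders built with Keras.
- Managed the architecture and development of the organization-level NLP stack using Spark and Spark NLP.
- Played a key role in creating value quantification methodologies for clients.
- Created a document processing pipeline that extracts key information to help insurance adjusters analyze medical histories.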
Senior Data Scientist
64 Squares Private Limited
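- Led and completed ML and analytics engagements for more than 20 clients across industries such as banking, insurance, retail, and eCommerce, and across geographies including the US, UK, and Australia.
- Worked closely with techniques ranging from traditional statistical models, such as linear and logistic regression, to advanced predictive models, such as random forests and gradient boosting.
- Gained extensive experience productionizing ML using RESTful APIs, batch processes, and database integration.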
Senior Consultant
Deloitte
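- Worked in the strategy and operations division, focusing on the healthcare and life sciences sector.
- Worked on growth opportunities/expansion, business plans, impact evaluations, and feasibility studies for a variety of clients.
- Worked with a variety of clients, including state and central governments, hospital chains, large business conglomerates, and bilateral funding and donor agencies.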
Experience
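Large-scale Keyword Revenue Prediction Model
Towards the end of the project, the client was able to achieve a 20% uplift compared to the existing model.
Large Medical Record Processing Engine
Generic Forecasting Model for Brick-and-mortar Retail Stores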
Chatbot for a Large B2B Aggregator
Skills
Languages
SQL, Python, Python 3, R
Libraries/APIs
Pandas, REST APIs, Scikit-learn, XGBoost, Spark ML, Google AdWords, PySpark, Flask-RESTful, TensorFlow, Keras
Tools
Git, Google Sheets, BigQuery, Microsoft PowerPoint, Google Analytics
Paradigms
Data Science, Key Performance Metrics, Business Intelligence (BI), ETL, Dimensional Modeling, Kimball Methodology, Distributed Computing
Platforms
RStudio, Jupyter Notebook, Amazon Web Services (AWS), Linux, Ubuntu, Windows, Google Cloud Platform (GCP), Docker
Industry Expertise
Project Management
Storage
MySQL, Data Pipelines
Other
Natural Language Processing (NLP), Predictive Analytics, Exploratory Data Analysis, Forecasting, Data Reporting, Google BigQuery, Product Analytics, Dashboards, Key Performance Indicators (KPIs), Machine Learning, Unstructured Data Analysis, Predictive Modeling, eCommerce, Analytics, Data Analysis, Data Visualization, Reports, Data Analytics, Data Mining, Office 365, APIs, API Integration, Data Extraction, Business Analysis, Statistics, Predictive Learning, Classification, Regression, Data Cleaning, Artificial Intelligence (AI), OpenAI GPT-3 API, OpenAI GPT-4 API, Language Models, nbdev, Version Control, ChatGPT, Team Mentoring, Supervised Learning, Unsupervised Learning, Econometrics, Finance, Gradient Boosting, Random Forests, Chatbots, ETL Tools, A/B Testing, Streamlit, Data Modeling, Big Data, Time Series Analysis, Deep Learning, Product Development, GPT, Generative Pre-trained Transformers (GPT), Large Data Sets, Generative Pre-trained Transformers 3 (GPT-3), Regular Expressions, OpenAI, Data Scraping, Architecture, Integration, Prompt Engineering, Time Series, LSTM Networks, Logistic Regression, Linear Regression, Feature Engineering, Clustering, Pipelines, OCR, Monitoring, Consulting, Financial Modeling, Strategy, Public Health, Public Policy, Marketplaces, Neural Networks, Modeling, Data Engineering, Google Data Studio, Web Marketing, Custom Models, Front-end, Recommendation Systems, Web Scraping
Frameworks
Spark, RStudio Shiny, Flask
Education
Master's Degree in Economics
Indira Gandhi Institute of Development Research - Mumbai, India
Bachelor's Degree in Physics
University of Delhi - Delhi, India