Global Fintech company
- Design inference infrastructure - online, batch and streaming
- Implement an offline inference solution
Technology stack: AWS, SageMaker, Python, Kubernetes, Kafka
Israeli unicorn startup - Deep Learning for Computer vision
- Redesign the entire inference infrastructure
Example projects
Zebra Medical Vision (acquired by Nanox)
- Remote training infrastructure based on Kubernetes
- Adoption of experiment tracking tools
- Cross-research team pipeline tool for ease of experimentation
- Automated large scale runs on data in the wild to detect algorithm anomalies before release
- CI for testing the inference code and packaging it along with the model as a Docker container
3DFY.ai
Large scale, distributed training environment based on Kubernetes and PyTorch Elastic.
Multiple Israeli startups
- Adoption of standard pipeline tools
- Adoption of experiment tracking
- Design and adoption of model reproducibility processes, packaging and versioning
- Design and implement testing methodologies based on individual product, data and algorithm quality risks
Example projects
Stream/Batch processing platform @PayPal
I led a group in charge of a new Big Data Platform for PayPal’s Risk org. The platform enabled PayPal engineers and scientists to author data pipelines combining stream and batch processing from A-Z: Infra, SDK, QA and DevOps Tooling.
Technology stack: Spark Streaming, Apache Beam, Kafka, Aerospike, Elasticsearch, Graphite, Grafana, ELK and more…
Zebra Medical (acquired by Nanox) - Data platform for Deep Learning (CV)
- Batch pipelines that ingest PB-scale medical imaging corpora
- Complex data de-identification protocols
- Full-stack annotation system for medical imaging, as part of the pipeline
- Automated, large scale pre-processing over Kubernetes for image cropping and augmentations
- Data warehouse that enables searching for images by metadata, as well as retrieving annotations and running evaluations
Technology stack: AWS, Kubernetes, MongoDB, Elasticsearch, PostgreSQL, Redis, Node.js, Angular, and more…
Global Fintech company
- Design the entire data infrastructure for the ML group - from events to feature engineering
- Implement a feature store solution (online / offline, batch / streaming)
Technology stack: Spark, Kafka, Python, Cassandra, AWS
3DFY.ai - Large scale data preprocessing for Computer vision
- Serverless cloud platform
- Perform expensive 3D rendering from meshes and store the results as files/objects
Technology stack: AWS, GCP, Docker, AWS Batch, Python
Several Israeli computer vision startups
- A serverless data pipeline that ingests images and annotations
- ELT pipeline that transforms the ingested data into a searchable data warehouse
Technology stack: AWS, GCP, Athena, Kinesis, Lambda, Snowflake, Redshift, dbt, and more…