Data Scientist, REMOTE (USA), AI, Data Fusion

Data Scientist

We are looking for a driven Data Scientist to work with our client, a company dedicated to improving global commerce. Their AI platform helps to fund and strengthen government institutions, disrupt transnational crime, and distribute the benefits of global commerce more broadly and inclusively. The business employs and advances the latest machine learning and data engineering technologies to help tackle some of the central challenges of our time.

This is a company that, through artificial intelligence, is unlocking the power of global economic data to make trade safer, more efficient, and more profitable.

They are building intelligence that includes the world’s most comprehensive representation of global commerce activity. This data asset, composed of billions of records, covers more than 40% of cross-border transactions, corporate ownership registries in over 100 countries, the global movements of goods, illicit web activity, and more. Built on this foundation, their proprietary machine learning technologies and products are designed to help customers manage risk, automate otherwise labor-intensive investigations, and better manage cross-border flows.

We are looking for a talented Data Scientist to help build this vision. You’ll work closely with engineers on projects to analyze and observe world-scale datasets and write code and models that can scale to produce never-before-seen insights.

This position can be performed remotely or from the company headquarters in NYC.


Responsibilities

  • Analyze global trade networks, using techniques from web/social/economic network analysis
  • Apply cutting-edge classification, regression, and clustering techniques, including deep learning, to handle high-dimensional feature and outcome distributions
  • Train your models across hundreds of millions to billions of observations
  • Work with data in English, Spanish, Portuguese, Chinese, Russian, Arabic, and more
  • Build performant models that deliver high-quality results when applied to non-stationary and adversarial distributions
  • Use unsupervised and semi-supervised techniques in cases of low outcome-data availability
  • Work with engineers to integrate your models into robust and performant data pipelines
  • Work with the top technical and domain experts on our advisory board, including Matt Jackson, Stanford professor and leading expert on economic networks
  • Collaborate with fellow engineers and data scientists across the organization


Requirements

  • B.S., M.S., or Ph.D. in an engineering or quantitative discipline, or equivalent work experience
  • 3+ years of industry experience
  • Expertise in machine learning and classical statistical analysis
  • Experience with agile development practices and Git version control
  • Ability to evaluate solutions in terms of business impact in addition to traditional stats or ML criteria
  • You have the ability to take ownership and iterate on a project through completion
  • You care deeply about machine-learning excellence, clean code, and knowledge-sharing
  • You have strong written and verbal communication skills

Nice to have, but not required

  • Expertise in one or more of the following: natural language processing, deep learning, computer vision, or network analysis
  • Experience with relational and graph databases
  • Experience with Docker and Kubernetes
  • Experience in machine learning model deployment
  • Working knowledge of cloud services like AWS, Azure, or GCP

Technologies we love

  • Languages: Python, Go, Java
  • Tools: Docker, Git, Airflow, Ansible, Swagger/OpenAPI, Dask
  • Datastores: Postgres, Redshift, MySQL, Elasticsearch, Neo4j

Why it’s great to work here

  • We love to collaborate, and we win as a team!
  • We are committed to engineering excellence
  • We value personal and professional development
  • We learn from diverse backgrounds and perspectives
  • We impact the world, from enabling developing countries to identifying drug traffickers

We are an equal opportunity employer with a commitment to inclusion across race and ethnicity, gender, sexual orientation, age, religion, physical ability, veteran status, and national origin. We offer a comprehensive healthcare package and paid parental leave of 2 months for the primary caregiver and 1 month for the secondary caregiver.