Data Scientist

McLean, VA

Job ID: 132386
Industry: Government


Candidate must have a TS/SCI with polygraph security clearance.

Program Description:

This position supports the IT Operations organization, which is responsible for monitoring and communicating IT infrastructure status; managing IT incidents; IT service management; developing monitoring applications and solutions; providing help desk and application support services; risk management; and cyber security response.

The selected candidate will support the Technical Director in creating a data analytics environment for AITOPS. The environment must include the necessary tools and a catalog of data and data sources for analyzing and identifying patterns of failure. These patterns will be fed into machine learning to recognize future patterns of failure. The environment must also contain a repository for storing the analytic results and a visualization tool for displaying them.

Day to Day Responsibilities:

The candidate will apply knowledge of C2S to create scripts that stand up instances for the analytic runs described above, each with different compute and storage requirements. These scripts must be well documented and ready to be called for execution from a code repository such as GitHub.

The candidate must understand the data lake concept, including data requirements, data updates, and access control. The data lake will be created in C2S as needed to support the analytic runs.

The required data sources reside in, among others, Splunk and network management tools. The candidate must be comfortable navigating these environments to identify the data needed for the analytic runs.

The candidate must be able to architect a production environment that receives live streaming data from different sources and uses a Hadoop-like engine such as Spark to process the data and detect possible patterns of failure. The environment will alert IT Operations to impending outages so that it can apply corrective measures to prevent an outage or mitigate its impact.

Providing predictive alerts to IT Operations is the ultimate goal, and this data analytics environment helps achieve it.

Day to day responsibilities may include:
  • Work with the Technical Director and staff to conceive, define, and build a new environment for data analytics in C2S
  • Develop scripts to efficiently stand up and tear down the cloud environment as the concept is further defined
  • Assist in setting up the Hadoop-like environment
  • Work with stakeholders throughout the organization to identify opportunities for leveraging organizational data to drive business solutions
  • Mine and analyze data from the Sponsor's databases to drive optimization and improvement of business strategies
  • Define and create a repository for storing machine learning results
  • Use predictive modeling to improve and optimize the customer's business outcomes
  • Develop processes and tools to monitor and analyze environment performance and data accuracy

Required Skills:
  • Demonstrated experience with data warehouse, database, and enterprise system architecture
  • Understanding of the concepts of data pedigree, reliability, and confidence
  • Familiarity with data storage technologies
  • Sufficient understanding of statistical research techniques to prepare and provide data to Data Scientists
  • Strong programming ability, including experience in one or more programming languages: Java, JavaScript, Python, R, SQL, etc.
  • Knowledge of a variety of machine learning techniques (clustering, decision tree learning, artificial neural networks, etc.) and their real-world advantages and drawbacks, in order to support the Data Scientists
  • Demonstrated strong negotiation, problem-solving, and analytical skills
  • A Bachelor's degree or equivalent experience (3 years for a Bachelor's) in Computer Science, Information Systems, or a scientific discipline
  • Minimum of 10 years' relevant experience, with 3 years in the Intelligence Community

Desired Skills:
  • Adept at Amazon Cloud environment creation, assessment and authorization (A&A) approval, and maintenance
  • Excellent communication and interpersonal skills, and the ability to communicate effectively with third parties and internal staff at all levels of the organization
  • Knowledge of and experience with statistical and data mining techniques: GLM/Regression, Random Forest, Boosting, Trees, etc.
  • Experience querying databases and using statistical computer languages: R, Python, SQL, etc.
  • Experience using web services: Redshift, S3, Spark, DigitalOcean, etc.
  • Experience creating and using advanced machine learning algorithms and statistics: regression, simulation, scenario analysis, modeling, clustering, decision trees, neural networks, etc.
  • Experience with distributed data/computing tools: Map/Reduce, Hadoop, Hive, Spark, MySQL, etc.

Job Type: FT
