Resume

Paul James


pauljames379 | pjames | @PaulJames379 | paul.james379@gmail.com | +1 865 309 5874


Summary

Accomplished Data Scientist with extensive R, SQL, ETL, Relational Database, Tableau, and Hadoop Stack experience. Skilled in data quality, data cleaning, statistical modeling, and predictive statistics. A team player with the ability to investigate and mine for results. Effective communicator capable of succeeding in fast-paced environments with meticulous attention to detail.

Areas of Expertise

| | | | |
|-|-|-|-|
| R and SQL | Data Mining | Machine Learning | Optimization |
| Linear/Logistic Regression | Problem Solving | Automated Reporting | Forecasting |
| Experimental Design | Data Warehousing | Process Control | Markdown |

Professional Experience

Data Scientist

Nissan North America, Inc. | 2017–present


PopHealthCare | 2016

* Leveraged historical data, statistical modeling techniques, and machine learning to identify potentially high-risk Medicare Advantage patients in order to manage their care and prevent adverse health outcomes.
* Used SQL procedures, R database connectivity, and R programming to manipulate big data and simplify data processes into automated scripts.

PYA Analytics | 2015

* Wrote software to automatically generate reports, cutting the data collection and report writing process from 2 weeks to 15 minutes and eliminating 40 hours of labor per month.
* Used open-source software and tools (MySQL, PostgreSQL, R) to deliver complete, effective results while reducing project timelines and costs.

Data Analyst

PYA Analytics | 2014

* Provided an interactive dashboard solution that gave decision makers at large ACOs visibility into costs and utilization, enabling them to lower costs and improve the quality of beneficiary care.
* Performed a data warehouse quality audit and cleaned datasets for use in proprietary survival analysis tools that evaluated US Marine Corps vehicle status and life expectancy to aid in mission selection.

Education

Bachelor of Science in Business Administration: Business Analytics, Supply Chain Management

UNIVERSITY OF TENNESSEE – KNOXVILLE


Projects and Accomplishments

GOAL: Develop a system for predicting which patients will be the most sick and costly in order to improve their health by managing their care.

RESULT: Inherited a disjointed, disparate process and was asked to automate and centralize it. Within a few months, transformed this convoluted process into a streamlined one: hundreds of steps acting on billions of rows of data across multiple servers became just a few steps that did the same work. Not only does this process save over 2 days of work every few weeks, but the data manipulations behind the scenes are optimized and robust. The machine learning model (an optimized random forest) significantly outperforms its predecessor while remaining statistically sound and robust. Because of all these factors, the company was able to take on more clients, which increased revenue. I'm not privy to a quantitative financial impact, but I was told by the CFO, "We're really glad you're here."

TOOLS USED: R, SQL, SSIS, SQL Server, PowerShell, Insurance Claims, Optimized Random Forest, Logistic Regression
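As an illustrative sketch of the modeling step only, assuming the randomForest package and a hypothetical prepared feature table (the columns high_risk, age, chronic_count, and prior_cost are placeholders; the actual claims features, tuning, and pipeline are not shown here):

```r
library(randomForest)

# Hypothetical feature table; real claims features and tuning differ.
claims <- read.csv("claims_features.csv")
claims$high_risk <- factor(claims$high_risk)

set.seed(42)
rf <- randomForest(
  high_risk ~ age + chronic_count + prior_cost,
  data  = claims,
  ntree = 500,   # in practice, tuned against out-of-bag error
  mtry  = 2
)

print(rf)  # reports OOB error and the confusion matrix
```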

GOAL: Audit healthcare practice diagnosis coding. The more codes that are correctly billed, the more revenue is received from CMS.

RESULT: Reverse engineered a CMS (Centers for Medicare/Medicaid Services) population health scoring algorithm (called HCC) from SAS to R and used it to audit a group of physician practices' diagnosis coding (ICD9s) for the previous few years. Delivered audited population health scores and expected capitations, which gave a clear picture of how their diagnosis coding practices were performing and where they needed to improve.

TOOLS USED: R, bash, SAS, Excel, CMS HCC algorithm, CMS Public Use files, ICD9
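A simplified sketch of the audit join, assuming dplyr and hypothetical input file layouts; the full CMS HCC model additionally applies hierarchies, demographic factors, and payment coefficients, which are omitted here:

```r
library(dplyr)

# Hypothetical inputs: billed_icd9.csv (patient_id, icd9) and a CMS
# ICD9-to-HCC crosswalk (icd9, hcc).
billed    <- read.csv("billed_icd9.csv", colClasses = "character")
crosswalk <- read.csv("icd9_to_hcc.csv", colClasses = "character")

# Map each billed code to its HCC category; unmapped codes earn no credit.
audit <- billed %>%
  left_join(crosswalk, by = "icd9") %>%
  group_by(patient_id) %>%
  summarise(
    hccs_captured  = n_distinct(hcc, na.rm = TRUE),
    codes_unmapped = sum(is.na(hcc))
  )
```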

GOAL: Improve ACOs' health system costs and utilization while helping those same systems increase margins and revenue.

RESULT: Implemented the CMMI Bundled Payments for Care Improvement initiative by performing claims data ETL, analysis, and reporting on LDS (CMS Limited Dataset) files for ACOs. Applied bundled payments metrics and algorithms to calculate past and present costs. Delivered a series of dynamic Tableau dashboards showing current, past, and trending costs/use by acute care provider, physician, diagnosis, and post-acute provider, from high-level aggregate views down to line-by-line costs. This tool provides a clear picture of the total cost of care for a patient across the entire healthcare spectrum, aiding healthcare cooperation, cost, and quality, and gave these ACOs a tangible path to save hundreds of thousands of dollars year over year.

TOOLS USED: R, Python, Postgres, SQL, Tableau, SparkR, CMS Public Use files, CMS Limited Dataset files, bash
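A minimal sketch of the extract step feeding the dashboards, assuming the DBI and RPostgres packages; the database, table, and column names are hypothetical, and the real LDS schema and bundled-payment logic are considerably more involved:

```r
library(DBI)

# Hypothetical connection and schema; the real LDS claims layout differs.
con <- dbConnect(RPostgres::Postgres(), dbname = "bpci_claims")

# Roll claim line costs up to provider/month for the Tableau extract.
costs <- dbGetQuery(con, "
  SELECT provider_id,
         date_trunc('month', service_date) AS svc_month,
         SUM(paid_amount)                  AS total_paid
  FROM   claim_lines
  GROUP  BY provider_id, svc_month
")

write.csv(costs, "tableau_extract.csv", row.names = FALSE)
dbDisconnect(con)
```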

GOAL: Cleanse data and create tools for LOGCOM to quickly assess ground vehicle health status, mobility, and expected lifetime.

RESULT: Held a US government security clearance to access and analyze United States Marine Corps Logistics Command's (LOGCOM) Oracle big data databases; was in charge of warehouse data quality assessment, data cleaning, and preparation of data extracts for use in statistical analysis and reporting (written report and Tableau dashboards). Quickly diagnosed data quality problems and found alternative solutions. Our team provided LOGCOM with tools to analyze its worldwide fleet of ground vehicles and make data-driven decisions with regard to costs, counts, health, mobility, and expected lifespan.

TOOLS USED: R, Oracle, bash, VPN, advanced statistical modeling, Tableau, Tableau Server, SQL
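A generic base R sketch of the kind of column-level quality profiling described; the actual LOGCOM checks and thresholds were specific to that warehouse, and the extract file name is hypothetical:

```r
# Profile each column for missingness and cardinality.
dq_profile <- function(df) {
  data.frame(
    column      = names(df),
    pct_missing = sapply(df, function(x) round(mean(is.na(x)), 3)),
    n_distinct  = sapply(df, function(x) length(unique(x))),
    row.names   = NULL
  )
}

vehicles <- read.csv("vehicle_extract.csv")  # hypothetical extract
dq <- dq_profile(vehicles)
dq[dq$pct_missing > 0.05, ]  # flag columns with more than 5% missing values
```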

GOAL: Create an easy-to-reproduce data quality report for clinical hospital data and recommend quality metrics, saving the world's largest for-profit operator of healthcare facilities the time and money of manually producing the report every month.

RESULT: Created automated report-generating software. This software uses Excel to gather report requirements, then uses R, Markdown, and R wrappers for JavaScript visualization libraries to automatically generate a well-formatted, interactive HTML data quality report mixing text, tables, and visualizations. The report is stand-alone (it doesn't need to be hosted on a website) and portable. It exceeded the healthcare operator's requirements and also shaped the data into summaries for use in Tableau. The automated report saves an employee at the company one week of manual data processing and report assembly every month. Also applied statistical techniques to recommend quality metrics and limits for their clinical data.

TOOLS USED: R, Markdown, Dimple.js, DataTables.js, Excel, HTML, CSS, bash
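A minimal sketch of the render loop, assuming the rmarkdown and readxl packages; report.Rmd stands in for a parameterized R Markdown template (whose chunks would produce the text, tables, and charts via the JavaScript wrapper libraries), and the workbook and facility column names are hypothetical:

```r
library(rmarkdown)

# Read the Excel requirements workbook that drives each report.
specs <- readxl::read_excel("report_requirements.xlsx")

# Render one self-contained, interactive HTML report per facility.
for (i in seq_len(nrow(specs))) {
  render(
    "report.Rmd",
    output_format = "html_document",
    output_file   = paste0("dq_report_", specs$facility[i], ".html"),
    params        = list(facility = specs$facility[i])
  )
}
```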