Financial Alignment Demonstrations for Dual Eligible Beneficiaries: A Look at CMS’s Evaluation Plan

MaryBeth Musumeci
Published: Jul 18, 2014

Key Questions

1. What does the law require the Secretary’s evaluation of the demonstrations to include?

Section 1115A requires the demonstration evaluation to assess the quality of care provided, including patient level outcomes and “patient-centeredness” criteria, and changes in Medicare and Medicaid spending.¹ The Secretary can require states and other entities participating in the demonstrations to collect and report information necessary for monitoring and evaluation purposes.² Section 1115A also directs the Secretary, to the extent feasible and based on input from multi-stakeholder groups, to select measures reflecting national priorities for quality improvement and patient-centered care.³ Finally, the law requires that the evaluation results be publicly available “in a timely fashion.”⁴

2. Which research methods will RTI use to evaluate the demonstrations?

RTI will use a combination of qualitative and quantitative methods in its evaluation, including:

Site visits: Two-person teams will make at least two site visits to each state. The first site visit will be within six months of the beginning of demonstration enrollment. RTI’s evaluation plan includes a site visit interview protocol for interviews with state demonstration staff.

Focus groups: Four focus groups of eight to 10 people each will be conducted in each state. Focus group participants will include beneficiaries, family members and informal caregivers. RTI and CMS will determine the timing for focus groups and decide whether to conduct any focus groups in languages other than English. If there appear to be high initial rates of opt out or disenrollment in some states, RTI will consider conducting focus groups with beneficiaries who have made those choices to better understand their decisions. The evaluation plan includes a preliminary focus group outline.

Stakeholder interviews: Interviews will be conducted quarterly by phone or in-person during site visits. There will be up to eight telephone interviews in each state within six months of demonstration implementation and up to eight in-person or telephone interviews in each state per demonstration year. The evaluation plan includes an interview outline for one-hour, one-on-one interviews to assess beneficiary experience in the demonstrations.

Interview participants will include representatives from:

beneficiary and advocacy groups;
the state implementation council;
the CMS-state joint contract management team, state officials, and key demonstration staff;
health plans and health or medical home providers;
entities providing enrollment options counseling to beneficiaries; and
the demonstration ombuds program.

State data reporting system: RTI will collect approximately 130 data elements on an aggregate (not beneficiary) level as part of the evaluation. The evaluation plan highlights complaints, grievances and appeals; disenrollment and opt out rates; information about waiting lists or lags in accessing services; and the rate of change in primary care provider assignment as data “of particular interest.” There are three components to the data collection:

A. Model summary: RTI will prepare a summary of each state’s demonstration, consisting of 21 static data elements, based on the state’s MOU with CMS.

B. Implementation tracking data: States will report quarterly on 45 data elements, with the first quarter beginning on the day of implementation. These data include progress indicators (numerical data reported in monthly increments) and tracking elements by design feature (yes/no responses and brief text descriptions about demonstration progress, successes, and challenges during the quarter).

1. Progress indicators include the number of beneficiaries eligible to participate in the demonstration; currently enrolled; passively enrolled; who opted out prior to enrollment; who voluntarily disenrolled; and whose enrollment ended (e.g., death, loss of eligibility). They also include the demonstration service area; number of 3-way contracts with plans; new CMS initiatives that may affect dual eligible beneficiaries in the demonstration area; and the number of health or medical homes participating in the demonstration and number of enrollees these entities serve (if applicable).

2. Tracking elements include new state policies or procedures to improve service integration; changes in reporting requirements; training or capacity-building activities for plans and providers, including primary care; new policies or procedures regarding care coordination or electronic health records; new or expanded demonstration benefits; activities to increase beneficiary enrollment; major challenges or issues in implementation and solutions developed; activities to engage stakeholders, enrollees, families or advocates in policy development or oversight; tracking and receiving data from plans and providers on new quality indicators; changes in payment methodology for plans and providers; timing of state’s most recent Medicaid Statistical Information Systems (MSIS) submissions; whether plans experienced any problems submitting encounter data; and other successes related to the demonstration.

C. Demonstration impact and outcomes: RTI’s analysis of claims, encounter and assessment data on quality, utilization and cost measures will yield 40 to 50 numerical data fields, updated quarterly. Sources will include MSIS and Medicare FFS claims data, Medicaid managed care organization (MCO) and Medicare Advantage plan encounter data, and the Nursing Home Minimum Data Set (MDS).

3. How will the quantitative analysis in the evaluation be structured?

Comparison group: RTI’s quantitative analysis for the evaluation will use an “intent-to-treat” approach, comparing beneficiaries eligible for each state’s demonstration with a similar population of beneficiaries who are unaffected by the demonstration. All eligible beneficiaries will be included in the demonstration group, regardless of whether they actively participate. The geographic area from which the comparison group will be drawn will be determined based on how it compares with the demonstration area in terms of population characteristics (e.g., age, income, race/ethnicity); market characteristics, such as provider supply; the size of the population meeting the demonstration’s eligibility criteria; Medicare and Medicaid spending per dual eligible beneficiary; the shares of long-term services and supports (LTSS) delivered in facility and community-based settings; and the extent of Medicare and Medicaid managed care penetration.

RTI will first consider whether it is possible to use an in-state comparison group for each state’s demonstration. If a demonstration is statewide or if the excluded regions of a state are not representative of the areas included in the demonstration, the comparison group will be out-of-state or possibly a combination of in- and out-of-state beneficiaries. Because all or most of their dual eligible population will be included in their demonstrations, Washington and, most likely, Massachusetts will have out-of-state comparison groups, while RTI will consider in-state comparison groups for Illinois, Ohio, and Virginia. The comparison group geographic areas will be determined within the first year of demonstration implementation, while the comparison group members will be determined retrospectively at the end of each demonstration year.
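The intent-to-treat rule described above can be illustrated with a short sketch. The record layout, field names, and area labels here are hypothetical and are not drawn from RTI's evaluation plan; the only point is that group assignment depends on eligibility and geography, never on whether a beneficiary actually enrolls.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Beneficiary:
    beneficiary_id: str
    eligible: bool   # meets the demonstration's eligibility criteria
    area: str        # "demonstration", "comparison", or "other" (hypothetical labels)
    enrolled: bool   # recorded, but deliberately ignored for group assignment


def assign_group(b: Beneficiary) -> Optional[str]:
    """Intent-to-treat assignment: enrollment status plays no role."""
    if not b.eligible:
        return None
    if b.area == "demonstration":
        return "demonstration"
    if b.area == "comparison":
        return "comparison"
    return None


# Hypothetical records, for illustration only.
people = [
    Beneficiary("A", True, "demonstration", True),
    Beneficiary("B", True, "demonstration", False),  # opted out, still counted
    Beneficiary("C", True, "comparison", False),
    Beneficiary("D", False, "demonstration", False),  # not eligible
]

groups = {p.beneficiary_id: assign_group(p) for p in people}
```

Beneficiary B opted out but remains in the demonstration group, which is what distinguishes an intent-to-treat design from an analysis of enrollees only.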

Claims data: RTI will analyze available Medicare and Medicaid data quarterly for selected quality, utilization, access to care, and cost measures and Nursing Home MDS data for facility admissions. The analysis for each state’s demonstration and comparison groups will include a two-year pre-demonstration baseline period and each demonstration year. RTI’s evaluation plan notes that as of August 2013, about one-third of potential demonstration and comparison group states had not yet submitted MSIS data for the second quarter of FY2012, meaning that claims data lagged more than one year. Ideally for RTI’s evaluation, MSIS data will be finalized within four to six months after the end of each quarter.

Encounter data: RTI also notes the evaluation’s need for encounter data from Medicare Advantage plans, non-demonstration-related Medicaid MCOs, and demonstration health plans in areas such as patient diagnosis, service intensity (brief vs. comprehensive visits), type of visit (preventive vs. treatment), ancillary services, and facility changes. Data also will be assessed for changes in coding patterns, given that capitated payments may be affected by coding intensity. RTI notes that the quality of encounter data is not yet known, Medicare Advantage plans have up to one year from the service date to submit data, and Medicaid managed care data is expected to vary by state. Consequently, the evaluation may be limited if data is incomplete or unavailable on a timely basis.

Beneficiary surveys: RTI will not conduct beneficiary surveys as part of its evaluation; however, RTI asks that findings from any surveys fielded by states, CMS, or other entities be shared for inclusion in the evaluation. Health plans in the demonstration must participate in the Medicare Health Outcomes Survey (HOS) and report Healthcare Effectiveness Data and Information Set (HEDIS) measures and the Medicare Consumer Assessment of Health Care Providers and Systems (CAHPS) survey. In addition, CMS’s demonstration operations support contractor will administer a beneficiary CAHPS survey in the managed FFS states. RTI notes that ideally all surveys would use a standard instrument and believes that the CAHPS instrument for assessing patient-centered medical homes seems most appropriate.

Research questions related to quantitative analysis:

Which demonstration model (managed FFS or capitated) has achieved greater savings?
Are there differences in key outcomes (e.g., quality, utilization, expenditure types) that can be attributed to the type of financial alignment model used?
Do the effects achieved by alternative integrated care models occur equally fast? Or, does one model (managed FFS or capitated) achieve gains more quickly than the other?
Does the approach to enrollment (e.g., passive) affect access to care and costs?
How does the relative degree of care management intensity and diversity across services affect outcomes?
Do these effects vary across subgroups of beneficiaries?

4. What evaluation reports will RTI produce for CMS and on what timeframe?

RTI’s evaluation reports will include:

State-specific initial reports, which will be qualitative and based on the first six months of implementation;

Quarterly reports for CMS and states’ ongoing demonstration monitoring, with preliminary information on enrollment, disenrollment, quality, utilization, and cost measures;

Annual reports, which will include descriptive statistics for each state’s demonstration and comparison group with estimates for beneficiary experience, utilization, access to care, cost, and quality measures. Changes in measures across years or subgroups within years will be noted (e.g., total costs (for Medicare and Medicaid separately), rates of primary and specialist care use, rates of avoidable hospitalizations and inappropriate readmissions, counts of hospital and nursing facility admissions and length of stay, rates of home and community-based services (HCBS) use, and mortality);

Final aggregate evaluation report, which will seek to determine the relative effectiveness of states’ demonstration design choices and study sources of variation at the state level; and

Final evaluation reports for each state, which will assess the demonstration’s overall impact on quality, utilization and cost measures relative to the comparison group.

While the law requires the evaluation results to be publicly available, the evaluation plan does not specify which of these reports will be released publicly.

5. How will demonstration implementation be evaluated?

RTI will profile each state’s care delivery system prior to the demonstration, identify key elements that the state’s demonstration intends to change, and measure the effects of any changes. The evaluation will describe major demonstration design features in each state and compare those features across demonstration states. The design features also will be used to identify demonstration characteristics associated with better outcomes in the quality, utilization, access to care, and cost analyses (described below).

The evaluation will examine how care coordination in the demonstration is structured, how prescriptive the state is in setting care coordination expectations in health plan contracts, how demonstration care coordination compares with that in other capitated programs serving other populations, and whether care coordination is person-centered. If possible, the evaluation will categorize the intensity and scope of mandated care coordination functions across all demonstration states. Information to evaluate demonstration implementation will be gathered from document review, stakeholder interviews, and state-reported data.

Demonstration Design Features to be examined:

The demonstration’s integrated delivery system (e.g., primary care, including medical or health homes; LTSS; behavioral health; developmental disability services);
Integrated delivery system supports (e.g., care team composition; use of health information technology at the state, provider, and plan level);
Care coordination/case management (e.g., assessment, service planning, and care management stratification processes);
Benefits and services (e.g., scope, new or enhanced);
Enrollment and access to care (e.g., integrated enrollment and care access; provider accessibility standards; opt out, disenrollment, and auto-assignment policies);
Beneficiary engagement and protections (e.g., state policies to integrate Medicare and Medicaid grievances and appeals, quality management systems); and
Financing and payment elements (e.g., financing model, incentives, shared savings).

Research questions related to implementation:

What are the primary features of each state demonstration and how do they differ from the state’s previous system available to the demonstration-eligible population?
To what extent did each state implement the demonstration as proposed?
Which states were able to fully implement their intended proposals?
Were certain models more easily implemented than others?
Were the demonstrations more easily implemented for certain subgroups?
What factors contributed to successful implementation?
What were the barriers to implementation?
How have beneficiaries participated in the ongoing implementation and monitoring of the demonstrations?
What strategies used or challenges encountered by each state can inform adaptation or replication by other states?

6. How will beneficiary experience be evaluated?

RTI’s evaluation of beneficiary experience in the demonstrations will include impact on quality of life, health outcomes, access to needed services, service integration and coordination across settings and delivery systems, provider choice, beneficiary rights and protections, and the delivery of person-centered care. Data to evaluate beneficiary experience include beneficiary focus groups; stakeholder interviews; monitoring of beneficiary engagement activities, grievances and appeals, and feedback from demonstration ombuds programs; claims data analysis on key quality, utilization and access to care measures; and the results of any beneficiary surveys performed by states, CMS or other entities. Focus groups and stakeholder interviews to assess beneficiary experience will include beneficiaries, relatives, and advocates but not service providers or anyone who oversees the demonstration.

Research questions related to beneficiary experience:

What impact do these demonstrations have on beneficiary experience overall, by state, and for beneficiary subgroups?
What factors influence the beneficiary enrollment decision?
Do beneficiaries perceive improvements in their ability to find needed health services?
Do beneficiaries perceive improvements in their choice of care options, including self-direction?
Do beneficiaries perceive improvements in how care is delivered?
Do beneficiaries perceive improvements in their personal health outcomes?
Do beneficiaries perceive improvements in their quality of life?

7. How will utilization and access to care be evaluated?

RTI will analyze the pre-demonstration (two years prior to implementation) and annual utilization rates during the demonstration of Medicare and Medicaid-covered services in each state to determine the demonstration’s effects on type and level of service use, ranging along a continuum from facility-based to home-based care. The evaluation also will calculate the average utilization rates for the pre-demonstration period and at the beginning, middle, and end of the demonstration. Utilization rates for each state will be stratified by hierarchical condition category (HCC) scores or health status measures. Nearly all utilization analyses will be conducted at the beneficiary level.
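As a rough illustration of a stratified utilization rate, the sketch below computes hospital admissions per 1,000 beneficiaries by period and risk stratum. The rate denominator, the two-level risk stratum, the period labels, and all of the numbers are assumptions made for illustration; the evaluation plan summarized here does not specify them.

```python
from collections import defaultdict

# Hypothetical beneficiary-level records: (period, risk_stratum, admissions).
records = [
    ("baseline", "low", 0),
    ("baseline", "low", 1),
    ("baseline", "high", 2),
    ("baseline", "high", 4),
    ("demo_year_1", "low", 0),
    ("demo_year_1", "low", 1),
    ("demo_year_1", "high", 1),
    ("demo_year_1", "high", 3),
]


def rates_per_1000(rows):
    """Admissions per 1,000 beneficiaries for each (period, stratum) cell."""
    totals = defaultdict(lambda: [0, 0])  # cell -> [admissions, beneficiaries]
    for period, stratum, admissions in rows:
        totals[(period, stratum)][0] += admissions
        totals[(period, stratum)][1] += 1
    return {cell: 1000 * adm / n for cell, (adm, n) in totals.items()}


rates = rates_per_1000(records)
```

Keeping the strata separate matters because a change in the overall rate can reflect a shift in the mix of low- and high-risk beneficiaries rather than a change in care patterns.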

RTI also will analyze patterns of primary vs. specialty care use, hypothesizing that primary care physicians will provide an increasingly higher proportion of visits in the demonstration group relative to the comparison group over time, unless non-visit compensation is provided to physicians in the comparison group. This analysis will account for the fact that specialists may provide primary care for people with chronic conditions. RTI also will explore measures to assess fragmentation of care for behavioral health and LTSS. RTI notes that the utilization and access to care analysis may be limited by potential problems with encounter data, lack of care coordination data, and incomplete behavioral health services data.

The evaluation of utilization and access to care in Massachusetts’ demonstration will include a focus on mental health and substance use disorder prevention and treatment services; community support services; and dental, vision, and non-medical transportation services, which the demonstration is expanding.

Research questions related to utilization:

What is the impact of the state demonstrations on utilization patterns during the course of the demonstration?
What is the impact on hospital and nursing facility admission rates, potentially avoidable hospitalization utilization rates by setting, and LTSS utilization rates? What is the impact of the demonstration on hospital and nursing facility length of stay?
Do demonstrations change the balance between HCBS and nursing facility use, the types of enrollees who use these services, and utilization rates by type of HCBS such as personal care? Do enrollees receive more HCBS as a result of the demonstrations?
Is any impact short term (lasting only one year before returning to the pre-demonstration level), or does it increase over time or reach a plateau after a year or two?
Does the observed impact vary by health condition or other beneficiary characteristics?
Will case management or care coordination lead to lower hospital admission rates or, if admitted, shorter lengths of stay and shorter nursing facility and home health care episodes?
Are demonstration group members using fewer inpatient services and more ambulatory services?
Is the impact greater for more medically complex (multiple chronic condition), high cost (top 10%) enrollees?

Research questions related to access to care:

Do demonstration participants experience increases in the mean number of primary care visits and increased visit rates by specialty type?
Does acuity on admission to nursing facilities increase? Do discharge rates back to the community from nursing facilities increase? Is there an increase in the proportion of HCBS users self-directing care?
Does the mental health outpatient utilization rate increase? Does the outpatient substance use disorder service utilization rate increase?

8. How will quality of care be evaluated?

RTI’s evaluation will analyze a set of quality measures common to all demonstration states (listed below).⁵ Many are HEDIS measures that demonstration health plans must report, although similar reporting is not required in the comparison states, with the result that these data will not exist for beneficiaries outside of the demonstration. In addition, state-specific quality measures will be finalized within six months of implementation. The evaluation plan calls for rapid-cycle monitoring, although the timeliness of encounter data submission by health plans is not yet known. RTI also will develop variables to control for observable differences between individual beneficiaries, both within the demonstration group and between the demonstration and comparison groups; at minimum, these will include demographic information, such as age, race, and sex. The quality measures listed below will be supplemented with information about beneficiary quality of life, satisfaction, and access to care (described above) and any relevant available beneficiary survey information.

Evaluation Quality Measures:

30-day all-cause risk-standardized readmission rate
Influenza immunization
Pneumococcal vaccination for beneficiaries age 65 and older
Ambulatory care sensitive condition admissions – overall composite
Ambulatory care sensitive condition admissions – chronic composite
Preventable emergency department visits
Emergency department visits, excluding those resulting in inpatient admission or death
Admissions with primary diagnosis of severe and persistent mental illness or substance use disorder
Follow-up after hospitalization for mental illness
Screening for clinical depression and follow-up
Cardiac rehabilitation following hospitalization for cardiac event
Percent of high-risk long-stay nursing facility residents with pressure ulcers
Screening for fall risk
Initiation and engagement of alcohol and other drug dependence treatment
Adult body mass index assessment
Annual monitoring for patients on persistent medications
Antidepressant medication management
Breast cancer screening
Comprehensive diabetes care – selected components
Controlling high blood pressure

9. How will cost be evaluated?

RTI’s evaluation will identify high-level cost measures that can be calculated for all states to monitor changes in cost over time. RTI notes that the evaluation will use a regression-based approach to determine cost savings, which will provide information about how various factors relate to costs.⁶ In the capitated models, costs will include per member per month rates, combined with costs for beneficiaries who opt out or disenroll. RTI will measure pre-demonstration and annual spending on beneficiaries for both Medicare and Medicaid, although RTI anticipates that only Medicare costs may be available for most states in the first annual report. In the capitated models, RTI anticipates that service-level spending will not be available from the encounter data reported by health plans, so the utilization analysis described above will be the means to understand the demonstration’s impact by type of service. In its annual reports, RTI will present costs for various subgroups of interest, such as demographic groups, LTSS users, beneficiaries with intellectual/developmental disabilities, those with end-stage renal disease, and those with chronic conditions such as diabetes, and will test for differences across demonstration years. The final evaluation report will include cost impact analysis using comparison groups. RTI notes that the availability and timeliness of encounter data will affect the cost analysis.
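The brief does not describe the regression model itself. As a simplified stand-in, the sketch below assumes a difference-in-differences comparison of mean per-beneficiary monthly costs, which is one common way to combine a pre-demonstration baseline with a comparison group. It is not presented as RTI's specification, and every figure in it is hypothetical.

```python
from statistics import mean

# Hypothetical per-beneficiary monthly costs in dollars; not real data.
costs = {
    ("demonstration", "baseline"): [1000.0, 1200.0, 1100.0],
    ("demonstration", "demo_year"): [1050.0, 1150.0, 1100.0],
    ("comparison", "baseline"): [900.0, 1100.0, 1000.0],
    ("comparison", "demo_year"): [1000.0, 1200.0, 1100.0],
}


def difference_in_differences(cell_costs):
    """Change in the demonstration group minus change in the comparison group."""
    change_demo = (mean(cell_costs[("demonstration", "demo_year")])
                   - mean(cell_costs[("demonstration", "baseline")]))
    change_comp = (mean(cell_costs[("comparison", "demo_year")])
                   - mean(cell_costs[("comparison", "baseline")]))
    return change_demo - change_comp


# A negative value means costs grew less in the demonstration group
# than in the comparison group over the same period.
estimate = difference_in_differences(costs)
```

A regression version of the same idea would add beneficiary characteristics as covariates, which is how such an approach can show how various factors relate to costs.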

Research questions related to cost:

Do the demonstrations reduce costs?
If so, how were the demonstrations able to reduce the costs of demonstration enrollees compared with the comparison group?
How do the demonstrations differentially affect expenditures for beneficiaries at risk for having high costs?

10. How will subpopulations and health disparities be evaluated?

RTI will work with CMS to identify subpopulations to analyze in each state, based on whether the state’s demonstration is targeted to a particular population, the size of subpopulations participating in the demonstration and how they are distributed across states, and how subpopulations can be identified in data sets in the demonstration and comparison groups. Possible subpopulation groups include racial and ethnic groups, people living in rural or inner-city areas, younger people with disabilities, people age 65 and older, people with serious and persistent mental illness, people with developmental disabilities, people with end-stage renal disease, people with multiple chronic conditions, LTSS users, and high-cost beneficiaries. The evaluation will not analyze all subpopulations in every state. Data sources for subpopulation analysis will include beneficiary focus groups and interviews with beneficiaries, state officials, and health plans with large subpopulations. Questions will include whether health plans refer beneficiaries to community services such as the Supplemental Nutrition Assistance Program and senior centers; have established protocols for treating common medical and non-medical problems among subpopulations; and have procedures to address the needs of people with limited English proficiency; and which features stakeholders believe are most effective in these areas.

Subpopulations of focus in Massachusetts’ demonstration: people with end-stage renal disease, those receiving behavioral health services, those with chronic physical conditions (estimated to be about 40% of the demonstration eligible population), and those receiving LTSS (including people with developmental disabilities in the community who are not receiving home and community-based waiver services). Other potential subpopulations in Massachusetts include people in facilities, people with high activities of daily living needs living in the community, and people with high behavioral health needs living in the community. RTI will compare characteristics of people who enroll in the demonstration with those who are eligible but do not enroll.

Subpopulations of focus in Washington’s managed FFS demonstration: eligible beneficiaries who receive different levels of health home services, ranging from no to intensive use. Groups also will be divided based on amount of time enrolled in a health home.

Research questions related to subpopulations and health disparities:

How do the demonstrations, as implemented by the different states, address the unique needs of subpopulations? Are there special initiatives designed to meet the needs of these populations (such as special care coordination efforts, new services for people with serious and persistent mental illness, nursing facility diversion programs)? Do the demonstration states successfully implement what they proposed? Do the models that focus on subpopulations work better than those that are designed for more general populations?
Do the demonstrations reduce expenditures and improve beneficiary experience, quality of care, and health outcomes for subpopulations? What is the effect on service use?
Do the demonstrations reduce or eliminate undesirable disparities (such as between African Americans and whites) in access to care, beneficiary experience, health care utilization, expenditures, quality of care, and health outcomes?
To the extent that the demonstrations have positive outcomes for subpopulations, what features of the demonstration account for these outcomes?

Examples of measures for people with behavioral health conditions: outpatient services; HCBS; new long-term nursing facility admissions for people with serious and persistent mental illness; access to a full range of scheduled and urgent medical and behavioral health care and LTSS; beneficiary reports of improved quality of life as a result of access to a full range of services; beneficiary choice of medical, behavioral health and long-term care services and providers; beneficiary reports on life satisfaction; care coordination assessment processes that integrate medical, behavioral health and LTSS; hospitalizations for people with serious and persistent mental illness, outpatient visits after hospitalizations for mental illness, and the initiation and engagement of alcohol and other drug dependence treatment.

Examples of measures for nursing facility residents: admission rates, acute care utilization (physician visits, hospitalization, emergency room), and cost patterns for short- and long-term stays; acuity level in new admissions to evaluate the extent to which the demonstrations are successfully maintaining frail beneficiaries in the community; and selected nursing facility quality measures. Trends in admissions and quality will be monitored within demonstration and comparison states.

Looking Ahead

As enrollment continues in the financial alignment demonstrations, interest in how these models will be evaluated will remain high. Federal and state policymakers, health plans, providers, beneficiaries, and other stakeholders will want to know whether and how the demonstrations are achieving their stated goals over the short and long term. CMS, through its contract with RTI, has set out its plans to evaluate the demonstrations in a number of areas, including implementation, beneficiary experience, utilization and access to care, quality of care, cost, and health disparities among subpopulations, at the aggregate and state levels. While the law requires the evaluation results to be publicly available, the evaluation plan does not specify which of the various reports produced will be released publicly. The evaluation plan also acknowledges that the analyses may be limited by the quality and timeliness of available claims and encounter data. In addition, there may be areas of interest that the evaluation does not fully assess, such as states’, plans’, and providers’ efforts to make their services, policies, and practices accessible to beneficiaries with disabilities. CMS’s release of its evaluation plans makes this information available so that stakeholders and the public can better understand how the demonstrations will be measured. As the evaluation progresses, it will be important for timely results and reports to be publicly available to promote broad discussion of the demonstrations’ successes and challenges among stakeholders.
