Data governance can be defined as both an organizational process and a structure; it establishes responsibility for data, organizing program area staff to collaboratively and continuously improve data quality through the systematic creation and enforcement of policies, roles, responsibilities, and procedures. As a structure, it assigns clear and specific roles and responsibilities and holds staff accountable for the quality of the data they manage. Ultimately though, data governance is not about who is in charge: it is about identifying existing or potential data problems and resolving them so that they do not occur or recur. As a continuous and iterative process, data governance is a systematic way of handling data throughout the information life cycle, from definition to retirement.
The P20 WIN Data Governance Manual details the structure, processes, roles and responsibilities that define its governance. This manual expands on the previous P20 WIN Data Governance Policy, which empowered the P20 WIN Data Governing Board to develop more detailed standards and processes for data management. This is a living document, and will be updated as agency and state priorities evolve. If a new policy is created or an existing policy changed, the Manual will be updated to include a copy of the new policy for reference.
Connecticut developed P20 WIN so that multiple, interagency datasets could be linked securely and simultaneously to create longitudinal views of student experiences into the workforce. Since 2014, the P20 WIN system has produced interagency data linkages for over twenty projects.
The following list delineates key goals and deliverables for P20 WIN:
To successfully coordinate and secure authorized access to data through P20 WIN within state and federal laws and regulations, documents and processes have been developed and approved by participants in the system.
The P20 WIN Data Governance structure has three governing bodies: the Executive Board, the Data Governing Board, and the Resident Advisory Board. These groups represent a three-tier leadership system. The Executive Board has ultimate policy decision-making authority for the P20 WIN Data Sharing process. The Executive Board can create standing or special committees (e.g., a legal committee) when necessary. The Data Governing Board enforces policies set forth by the Executive Board related to cross-agency data management, including, but not limited to, data confidentiality and security in conformance with applicable law. Finally, the Resident Advisory Board advises and guides the Executive Board and the Data Governing Board on how to communicate the work and mission to State residents.
Together these groups provide a framework for system leadership, implementation and improvement. Independent of the governance structure, Data Stewards are the staff at the Participating Agencies who work closely with the data available for request. Data Stewards from each Participating Agency are encouraged to join the regular Data Stewards meetings to focus on the implementation of system policy and to determine the availability, security and quality of data.
The Executive Board is a multi-agency committee composed of Commissioners and Executive leadership from each of the Participating Agencies. The Executive Board develops the vision for P20 WIN and provides oversight and leadership for the data governance structure. The Executive Board is responsible for sustaining the P20 WIN system by supporting its vision and securing resources to maintain its efficient operation.
While membership within P20 WIN is not restricted to Executive Branch state agencies, the P20 WIN Executive Board Chairperson must be a State official or employee. The Chairperson conducts all Executive Board meetings, represents the P20 WIN Data Sharing process, and works with all Participating Agency leaders and political leaders to ensure agency-to-agency coordination and to further data sharing to improve services provided to the residents of Connecticut. The Chairperson leads the Executive Board to set the direction for the Data Sharing process and coordinates with the Operating Group to set agendas and resolve operational matters.
Responsibilities:
The Executive Board meets quarterly on the third Tuesday of the month at 2:00pm, unless otherwise noted. Agendas, minutes, and recordings of past meetings can be found on the Governance page of the P20 WIN website.
The Data Governing Board consists of one staff member from each Participating Agency who can deliver policy, resource and staffing recommendations to support the Participating Agency in the Data Sharing process. Members work collaboratively to maintain the security of the Data Integration Hub. Once policies and resource commitments are approved by the Executive Board, the Data Governing Board members are responsible for implementing and enforcing these policies.
Responsibilities:
Member Expectations:
The Data Governing Board meets every fourth Friday at 8:30am. Meeting agendas, minutes and recordings of past meetings can be found in the Governance section of the P20 WIN website.
Data sharing by governmental agencies should incorporate the knowledge and expertise of the state residents. Many state agencies have established advisory groups seeking to gain insight from the community and their lived experiences. The Office of Early Childhood has a Parent Cabinet and the 2Gen Advisory Board with a parent engagement component. The Department of Education created the Commissioner’s Roundtable for Family and Community Engagement in Education. The Department of Children and Families has a Data Integration Work Group.
The Executive Board will establish a Resident Advisory Board with members representing residents of Connecticut who receive or have received state services and benefits. The role of the Resident Advisory Board is to provide advice and guidance to the Executive Board and the Data Governing Board on how to effectively communicate its work and mission to State residents.
Responsibilities:
In alignment with the policies set forth by the Executive Board and Data Governing Board, the Data Stewards work on technical implementation of the Data Sharing system and are responsible for the availability, security and quality of data shared through the Data Integration Hub.
The Data Stewards working group also recommends to the Data Governing Board policies or practices to be developed or improved. Members are then responsible for carrying out the approved data system policies. Data stewards not only represent the interests of their agency, but work to support the state’s overarching vision for the Data Sharing process.
Responsibilities:
Together the P20 WIN data governing bodies work in concert to ensure that the following processes operate smoothly.
Process | Details |
---|---|
Response to data requests | Requests for data from P20 WIN are directed from the Operating Group to the Data Governing Board, which manages the life cycle of the request. This process is outlined in Section 7 of this manual as well as the Data Sharing Playbook. |
Determination of authorized users and access rights to resultant datasets | In accordance with state and federal law and each data request’s Data Sharing Agreement, the Data Governing Board members representing agencies whose data are in a data request (i) approve the users who are to have access to de-identified unit record data from P20 WIN and (ii) establish the parameters for data dissemination and destruction for each specific data request. |
Development and maintenance of cross-agency data dictionary | Data Stewards are responsible for ensuring that the data dictionary for each Participating Agency is complete and up-to-date. |
Establishment of guidelines for data analysis as necessary | The Data Governing Board establishes processes that support a common approach to data analysis for resultant data sets as appropriate. |
Expansion of P20 WIN | The Executive Board approves a process for adding new state agencies or organizations or additional data so that the technical infrastructure can be expanded and new agencies have representation in the named committees. The process for a new agency to join P20 WIN can be found in Section 3 of this manual or in Appendix 1 of the eMOU. |
Creation of policies to sustain P20 WIN | The Executive Board establishes policies to sustain and improve P20 WIN including how P20 WIN will be staffed and supported financially. |
Modification to this policy | The Data Governing Board can make recommendations to modify P20 WIN policies and the eMOU to the Executive Board. |
Any agency that is interested in joining P20 WIN must first submit a memo regarding their intent to join. This document should address the agency’s authority to access and share data and the potential security, privacy, confidentiality, or conflict of interest concerns that might be raised if the agency participates in the Data Sharing process.
Any agency interested in joining P20 WIN will go through the following process:
The official process to join can also be found in Appendix 1 of the Enterprise Memorandum of Understanding, which is included at the end of this document.
The Operating Group and Data Integration Hub share responsibility for hosting and operating P20 WIN, with the Operating Group serving as the lead agency responsible for operations and the Data Integration Hub responsible for hosting, linking and transmitting Data.
The Office of Policy and Management serves as the Operating Group for P20 WIN. The Department of Labor serves as the Data Integration Hub and provides data matching services for approved data requests.
The Operating Group facilitates smooth and efficient operation of P20 WIN for the benefit of the Participating Agencies and the greater benefit of the State of Connecticut. The Office of Policy and Management serves as the Operating Group for P20 WIN.
Responsibilities:
The Department of Labor serves as the Data Integration Hub and provides data matching services for approved data requests.
Responsibilities:
The State Department of Education (SDE), the Office of Early Childhood (OEC), the Department of Labor (DOL), the Connecticut State Colleges and Universities (CSCU), the University of Connecticut (UCONN), the Connecticut Conference of Independent Colleges (CCIC), the Department of Social Services (DSS), the Department of Children and Families (DCF), the Office of Higher Education (OHE) and the Connecticut Coalition to End Homelessness (CCEH) are actively collaborating to support the P20 WIN System. The system is designed with flexibility to expand and include additional state agencies or organizations such as the Department of Public Health and the Department of Housing. The process for an agency or state organization to join P20 WIN is defined in the “How to Become a Participating Agency” section of this manual.
Each participating agency is responsible for the data that reside in their respective systems. The Data Governance Policy does not affect data that is not shared through P20 WIN. Rather, this policy covers only the data that is shared between organizations both at the unit and aggregate level through P20 WIN. While data will be transported, matched and eventually stored electronically, this manual covers the use of shared data in reports and documents whether electronic or in print.
The previous sections outline the key management roles and responsibilities to support and maintain cross-agency data sharing. The following sections describe key data policies that promote data quality, data accuracy, and acceptable use standards.
The quality of the data in P20 WIN ultimately determines the system’s overall value to state officials and external researchers. The National Center for Education Statistics (NCES) notes that key attributes of data quality are accuracy, completeness, timeliness, validity, and consistency.¹ Data inaccuracy refers to errors in records that are often the result of poor data entry practices and poor regulation of data accessibility.² Data completeness refers to the degree to which all data is available, which is often measured by the number of missing records.³ Data validity refers to whether the data fields actually measure and reflect the systems or outcomes of interest. Data consistency refers to the degree to which the data is stable across both time and sources.
Integrated data systems must account for data inconsistencies across agencies. For example, an individual’s race as tracked in DSS Medicaid tables may not match the CCEH HMIS race identifier. Nonetheless, strong data governance policies can increase the value of such systems for key stakeholders. Minnesota’s SLDS Governance Manual notes that “The key to maintaining high quality data is a proactive approach to data governance that requires establishing and regularly updating strategies for preventing, detecting, and correcting errors and misuses of data.”
P20 WIN Participating Agencies providing data do not guarantee 100% accuracy of all fields, and each Data Recipient accepts the quality of the data as received.⁴
The Executive Board can establish a data quality committee to introduce strategies to improve data standards. Other states have suggested participating agencies conduct regular data quality audits. For example, Virginia’s State Longitudinal Data System research sub-committee encourages participating agencies to develop some basic data quality reports that show range of values and frequency of missing values to accompany the master Data Dictionary. Other strategies include automatic edit checks that crosscheck the same field from year to year, reports on matching rates, and clean metadata for all possible extracts in the SLDS.
To facilitate identity resolution in an interagency data request, each Participating Agency must structure the input file so that there is one unique record per Individual represented. Data pulled from a dimensional data source must be flattened to produce an input file in this format. Additionally, the Participating Agency should assign each record in the file a unique generic ID that identifies the Participating Agency providing the information.
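As a rough illustration of the one-record-per-individual requirement, the sketch below collapses repeated rows and assigns a generic ID to each remaining record. The function name, the `GenericID` field name, the `AGENCY-000001` ID format, and the choice to keep the first row seen for each individual are all assumptions made for this example; the manual does not prescribe them.

```python
def flatten_to_unique_records(rows, key_fields, agency):
    """Collapse rows to one per individual and assign a generic ID to each.

    rows: list of dicts (for example, rows pulled from a dimensional source).
    key_fields: fields that together identify one individual (assumed).
    agency: agency acronym embedded in the generic ID (format is hypothetical).
    """
    unique = {}
    for row in rows:
        key = tuple(row[field] for field in key_fields)
        # Keep the first row seen for each individual (an assumed policy).
        unique.setdefault(key, row)
    return [
        {"GenericID": f"{agency}-{n:06d}", **row}
        for n, row in enumerate(unique.values(), start=1)
    ]


rows = [
    {"SSN": "012345678", "LastName": "Doe", "Term": "Fall"},
    {"SSN": "012345678", "LastName": "Doe", "Term": "Spring"},
    {"SSN": "987654321", "LastName": "Roe", "Term": "Fall"},
]
records = flatten_to_unique_records(rows, ["SSN"], "DOL")
```

In practice an agency would decide which attributes survive flattening instead of simply keeping the first row.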
Participating Agency Data Stewards should follow the standardized formatting requirements when preparing a data file for matching. The Data Integration Hub has the right to reject any data that does not meet these standards.
For matching files – Files should be tab-delimited text files. There should be no quotation marks around text fields.
For analytical files – Files should be in the format requested and approved for the given data request.
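The matching-file format above can be sketched as a small helper that joins fields with tabs and refuses values that would break the layout. This is an illustrative sketch only: the function name is hypothetical, and rejecting (instead of silently stripping) quoted or tab-containing values is an assumed design choice.

```python
def format_matching_line(fields):
    """Join one record's fields into a tab-delimited line with no quoting."""
    cleaned = []
    for value in fields:
        text = str(value)
        # A tab or line break inside a field would corrupt the column layout.
        if any(ch in text for ch in ("\t", "\r", "\n")):
            raise ValueError(f"field contains a tab or line break: {text!r}")
        # Text fields must not be wrapped in quotation marks.
        if len(text) >= 2 and text.startswith('"') and text.endswith('"'):
            raise ValueError(f"field is wrapped in quotation marks: {text!r}")
        cleaned.append(text)
    return "\t".join(cleaned)


line = format_matching_line(["012345678", "20000101", "F"])
```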
The file naming standard for data files transferred via SFTP is as follows:
a. Agency acronym
b. Data Request Number listed in DSA
c. Date file was created (MM_DD_YYYY)
d. Contents of file (matching, analytical, key)
example: “DOL.000.01_01_2000.matching”
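The naming standard above can be sketched as a helper that assembles the four parts. The function name is hypothetical, and the period separator between parts is inferred from the example file name; the manual does not state it explicitly.

```python
from datetime import date

# Content types listed in the naming standard.
ALLOWED_CONTENTS = {"matching", "analytical", "key"}


def build_transfer_filename(agency, request_number, created, contents):
    """Build an SFTP file name: agency, request number, MM_DD_YYYY date, contents."""
    if contents not in ALLOWED_CONTENTS:
        raise ValueError(f"contents must be one of {sorted(ALLOWED_CONTENTS)}")
    # Parts are joined with periods (assumed from the example in the manual).
    return f"{agency}.{request_number}.{created.strftime('%m_%d_%Y')}.{contents}"


name = build_transfer_filename("DOL", "000", date(2000, 1, 1), "matching")
```

With these inputs the helper reproduces the example file name given above.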
The table below identifies the most commonly requested data variables.
Field/Data Element | Format | Possible Values | Definition | Comment |
---|---|---|---|---|
SSN | Text or varchar(9) | | Social Security number | Include leading zero |
DOB | YYYYMMDD | YYYYMMDD | Date of birth | As a text field |
Gender | Text or varchar(1) | M = male, F = female, U = unknown | Gender | |
Race/Ethnicity | | | | |
LastName | Text | | | |
MiddleName | Text | | | |
FirstName | Text | | | |
SASID | Text or varchar(10) | | State Assigned Student Identification Number | |
HighSchoolCode | Text or varchar(6) | | CollegeBoard HS assigned code | Include leading zero |
Fake ID | | | A unique fake identifier to be used | |
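A Data Steward might check a record against the formats in the table before transfer. The sketch below is one possible check, not the Data Integration Hub's actual validation: the function name is hypothetical, the record is assumed to be a dict of strings keyed by the field names in the table, empty fields are treated as allowed, and the SSN is assumed to be exactly nine digits.

```python
from datetime import datetime

VALID_GENDER = {"M", "F", "U"}


def validate_record(record):
    """Return a list of format problems found in one matching-file record."""
    problems = []

    ssn = record.get("SSN", "")
    if ssn and not (len(ssn) == 9 and ssn.isascii() and ssn.isdigit()):
        problems.append("SSN must be 9 digits stored as text, with leading zeros")

    dob = record.get("DOB", "")
    if dob:
        if len(dob) == 8 and dob.isascii() and dob.isdigit():
            try:
                datetime.strptime(dob, "%Y%m%d")
            except ValueError:
                problems.append("DOB is not a valid calendar date")
        else:
            problems.append("DOB must be text in YYYYMMDD format")

    gender = record.get("Gender", "")
    if gender and gender not in VALID_GENDER:
        problems.append("Gender must be M, F, or U")

    if len(record.get("SASID", "")) > 10:
        problems.append("SASID must be at most 10 characters")

    if len(record.get("HighSchoolCode", "")) > 6:
        problems.append("HighSchoolCode must be at most 6 characters")

    return problems


ok = validate_record({"SSN": "012345678", "DOB": "20000101", "Gender": "F"})
bad = validate_record({"SSN": "12345678", "DOB": "20000230", "Gender": "X"})
```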
Metadata is structured data about data. High quality metadata provides helpful context about the data’s creation, quality, and uses and is key to improving data discovery. Metadata helps to answer the question “what is the data about?” by providing more detail about various characteristics of the data, including information about the data source, update frequency, and level of detail. Dataset metadata elements should include:
Related guidance for metadata on the open data portal can be found here. Participating Agencies should provide key metadata to accompany the interagency Data Dictionary fields. The Data Governing Board is responsible for developing, documenting and monitoring Data Definitions and Metadata for shared Data Elements within the cross-agency Data Dictionary. The Operating Group works with the Data Governing Board to ensure that the Data Dictionary for each Participating Agency is complete and up-to-date.
Cross-agency data sharing is for legitimate governmental purposes, which can include evidence-based policy-making, academic research to support the wellbeing of state residents, and mandated, periodic state reporting. Each Participating Agency and the Data Integration Hub can use Data only for the authorized Data Sharing Request purpose, in accordance with applicable federal and state law, the P20 WIN eMOU and the Data Sharing Agreement. No participating agency, person or entity can maintain, use, disclose or share Data in a manner inconsistent with the terms of the P20 WIN eMOU or its Appendices and applicable federal and state laws and regulations. Upon completion of the approved project in a Data Request, all Data must be destroyed as determined by the Participating Agencies in the Data Sharing Agreement. In accordance with applicable law and the P20 WIN eMOU, the Data Governing Board establishes the parameters for Data Transmission and Data destruction.
External users of the SRR program can submit files via HTTPS upload utilizing their account credentials from a Web browsing session.
To access the CT.GOV Secure Transport SIGN IN page, open a Web browser (Mozilla, IE 11, Edge, Safari or Chrome are all compatible with Secure Transport) and navigate to:
https://sft.ct.gov/
From the SIGN IN page, the user enters their account name and password and clicks “Sign In”.
Please note that BOTH “User Name” and “Password” values are CaSE sEnSiTIVe.
Because the “Require user to change password on next login” option is set when the account is created, the user is prompted to change their password upon their first login.
Passwords must be at least 7 characters in length and contain at least 1 Alpha and 1 Numeric character.
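The password rule above can be expressed as a small pre-check a user or administrator might run before submitting a new password. This is a sketch of the stated rule only, not Secure Transport's own validation; the function name is hypothetical, and treating “Alpha” as any alphabetic character is an assumption.

```python
def meets_password_policy(password):
    """Check the stated rule: at least 7 characters, 1 letter, and 1 digit."""
    has_alpha = any(ch.isalpha() for ch in password)
    has_digit = any(ch.isdigit() for ch in password)
    return len(password) >= 7 and has_alpha and has_digit
```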
Upon successful authentication, the user is brought to their “home” folder. From this screen, users can:
Example of file browse, upload, view, and deletion
The file name appears next to the “Browse” button, indicating that the file is in focus to be uploaded. If you selected multiple files to upload, you will see “x files”, where x is the total number of files you have selected. You can click “Upload File” to upload the file from your local system to your home folder in the Secure Transport environment.
Example of a successful file upload for an SRR account.
DO NOT close the tab or browser window as this will terminate your Secure Transport Session abnormally.
Terminating your browser session without using the Sign Out method in Secure Transport will void your session cookie, requiring you to completely close your web browser, re-launch the browser and re-authenticate to Secure Transport.
To download a file from your home directory, select the file you wish to perform the action on by enabling the check mark next to the file and click “Download”. Depending upon your browser you may be prompted to View the file or Save the file to your local system. Again, depending upon your browser, your file may be saved by default to a “download” directory in your user profile on your local system.
It is recommended that users are fully familiar with the “download” process for their particular browser and cognizant of the location on their local system where browser downloads are stored.
To delete a file from your home directory, select the file you wish to perform the action on by enabling the check mark next to the file and click “Delete”.
Please note that the Delete method is immediate, and does not prompt you to confirm deletion. Please use the file ‘delete’ method carefully. You may only select and delete one file at a time.
Also note that by default, files uploaded to your home target directory are automatically purged from the Secure Transport system after 60 days.
Business Program owner(s) | 200 Folly Brook Boulevard, Wethersfield, CT 06109 Phone: 860-263-6281 Andrew Condon June 29, 2022 |
Delegated Administrators: | |
Production Environment | Secure Transport 5.3.0 |
Administration: | https://159.247.3.180:444/coreadmin/auth/login.jspx |
User Upload target: | https://SecureTransport.ct.gov |
Staging Environment | Secure Transport 5.3.0 |
Administration: | https://159.247.3.179:444/coreadmin/auth/login.jspx |
User Upload target: | https://SecureTransport.ct.gov |
Support & Incident Escalation | Secure Transport 5.3.0 |
DOL Service Desk: | https://dol-ap0141/MRcgi/MRentrancePage.pl |
DOL Contacts: | Jackie Russo |
BITS Service Desk: | |
BITS Contacts: | |
Additional Resources | |
AXWAY On-Line Help: | https://159.247.3.180:444/help/Default.htm |
All policies and actions of the Data Governing Board shall further the priorities outlined in the P20 WIN Learning Agenda. The Operating Group, with the Participating Agencies, will support the continued progress in each area of focus: college and career success, student readiness, financial aid, workforce training, and overcoming barriers to success. These areas have a broad scope that should allow for a wide array of research projects and involve many of the participating agencies.
The Learning Agenda is subject to change as state and agency data needs evolve; therefore, the Data Governing Board may propose changes to this document for the Executive Board to approve. Agencies are also encouraged to develop agency-specific learning agendas. The agenda can be updated at any time, but the Executive Board conducts an annual review of the agenda at its November meeting.
The purpose of this research is to provide information to support course placement decisions at colleges and universities and to provide information to families in Connecticut about the probability of admission to four-year institutions.
It is essential to provide multi-faceted supports so that all students can achieve the highest levels of academic readiness. Research will focus on three critical educational systems/transitions: early childhood to K-12; elementary/middle to high school within K-12; and high school to post-secondary. It is vital for this research to include not just mainstream education data, but also social services, child welfare, housing, family life, and adult education data to get a fuller picture of student experiences.
Connecticut must have a better understanding of the dynamics of financial aid and the outcomes of state financial aid grant recipients so that we can maximize the opportunity for students with limited state resources.
The success of the state’s workforce education training system, which is critical to the state’s economic development, requires using data to inform decision-making and programming. The state is working to develop standards around measuring the return on investment (ROI) for Connecticut’s public workforce training programs.
Embedded in each of the prior topics is the need to help individuals who face barriers to success or who are at risk of falling behind due to conditions such as homelessness or engagement with the child welfare system. P20 WIN has expanded to include state agencies that address social services, homelessness, and child welfare. Establishing these connections allows us to understand the degree to which residents face additional challenges and to develop programs that support these students and move individuals and families into cycles of success.
With support from the Operating Group and the Participating Agencies, the Data Recipient details the intended purpose, identified data content, security expectations, and availability and dependency requirements in the DSA and the Data Sharing Request Form (Exhibit A to the Data Sharing Agreement), which shall address, but not be limited to, the following:
If a data requestor needs to ask a participating agency questions concerning the data, all communications must occur in a secure environment. For instructions on how to use the secure email platform provided by BITS, please refer to this link. If you are a data requestor, you can request that a state agency initiate a secure email chain.
Data definitions mean the plain language descriptions of data elements.
Data dictionary means a listing of the names of a set of data elements, their definitions and additional metadata. It does not contain any actual data, but provides information about the data in a data set.
Data elements mean units of information that are stored or accessed in any data system, such as a student identification number, course code or cumulative grade point average.
Data Integration Hub means the entity that conducts data matching for approved data requests. The CT Department of Labor (DOL) is the Integration Hub for P20 WIN.
Data Stewards means the technical staff at Participating Agencies with knowledge of the available datasets.
Data Governing Board means the board responsible for creating and enforcing policies that support the data sharing process. Each Participating Agency has a representative on the Data Governing Board.
Metadata means the information about a data element that provides context for that data element, such as its definition, storage location, format and size.
Operating Group means the entity serving as the administrative lead agency responsible for the operations of P20 WIN. The CT Office of Policy and Management (OPM) is the Operating Group for P20 WIN.
Participating agency means any entity that has signed the Enterprise Memorandum of Understanding for participation in P20 WIN and has been approved for participation by all other participating agencies.
Preschool through Twenty and Workforce Information Network or “P20 WIN” means a state longitudinal data system that matches and links data of state agencies and other organizations for the purpose of conducting audits and evaluations of federal and state education programs.
Data Request Management means the review process for each data request submitted to the system. The Data Request Management process is set forth in the P20 WIN Data Request Management Procedure. No data will be included in a data match for any given Participating Agency unless the given Participating Agency has approved the inclusion of its data and has approved the individual(s) or entities who have authority to access the resulting data set.