Data Subset Modernization Guide
Data Subset is a testing product by Grid-Tools. Explore technical details, modernization strategies, and migration paths below.
Product Overview
Data Subset is a tool designed to create smaller, representative subsets of production data for use in testing, development, and training environments.
It typically runs on z/OS and may require specific subsystems depending on the data sources being accessed.
Modernization Strategies
Rehost
- Timeline: 6-12 months
Lift-and-shift to cloud infrastructure with minimal code changes. Fast migration with lower risk.
Refactor (Recommended)
- Timeline: 18-24 months
Optimize application architecture for cloud while preserving business logic. Best ROI long-term.
Replatform
- Timeline: 3-5 years
Complete rewrite to cloud-native architecture with microservices and modern tech stack.
Frequently Asked Questions
General
What does Data Subset do?
Data Subset is a tool designed to extract smaller, representative portions of production data for use in testing environments. This allows organizations to test applications with realistic data without exposing sensitive information or overwhelming test systems with the full production dataset.
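As a rough illustration of what "representative" means in practice, the sketch below samples a fraction of child rows and then keeps only the parent rows those children reference, so the subset stays internally consistent. It is a minimal Python sketch under stated assumptions: the customers/orders tables, the column names, and the subset_with_parents helper are hypothetical and are not part of Data Subset itself.

```python
import random


def subset_with_parents(customers, orders, fraction, seed=0):
    """Sample a fraction of orders and keep only the customers they reference.

    Hypothetical helper for illustration; not Data Subset's own logic.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    sample_size = max(1, round(len(orders) * fraction))
    sampled_orders = rng.sample(orders, sample_size)

    # Pull in every parent row a sampled order points at, and nothing else.
    needed_ids = {order["customer_id"] for order in sampled_orders}
    kept_customers = [c for c in customers if c["id"] in needed_ids]
    return kept_customers, sampled_orders


# Hypothetical "production" data: 100 customers, 1000 orders.
customers = [{"id": i, "name": f"customer-{i}"} for i in range(1, 101)]
orders = [{"id": n, "customer_id": (n % 100) + 1} for n in range(1, 1001)]

sub_customers, sub_orders = subset_with_parents(customers, orders, 0.05)
```

Sampling from the child table and walking up to the parents is only one possible strategy; a real tool would also have to follow deeper relationship chains and apply masking before the data leaves production.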
Is this a system, application, or tool?
Data Subset is a specialized toolset focused on data extraction and manipulation for testing purposes. It is not a complete system, application, framework, or middleware, but rather a utility that supports those broader systems.
What types of organizations use this?
Organizations that require robust testing of applications against realistic data, particularly those dealing with large production databases, benefit from Data Subset. This includes companies in regulated industries like banking, insurance, and healthcare where data privacy and compliance are critical.
When should we consider Data Subset?
Consider Data Subset when you need smaller, manageable datasets for testing, development, or training. It is especially relevant when sensitive data must be masked or when full production datasets are too large to use efficiently in non-production environments.
What are the alternatives to Data Subset?
Alternatives to Data Subset include manual data extraction and masking processes, custom-built scripts, and other commercial data subsetting tools. Competing products include IBM InfoSphere Optim Test Data Management and Delphix.
Technical
What infrastructure is required?
Data Subset typically runs on z/OS and may require specific subsystems depending on the data sources being accessed. It is often deployed on-premise, close to the production data it needs to extract.
Does Data Subset run in an LPAR?
Yes. Data Subset runs in an LPAR and depends on z/OS. It may require specific subsystems such as DB2 or IMS, depending on the data sources being accessed.
What configuration files are used?
Data Subset often uses configuration files to define data extraction rules, masking policies, and connection parameters. These files specify which tables and columns to include in the subset, how to mask sensitive data, and how to connect to the source databases.
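To make the three kinds of settings concrete, here is a minimal Python sketch of a rule file and a loader that validates it. The JSON layout, the key names (source, tables, row_filter, masks), the mask names, and the sample values are all assumptions made for illustration; the actual Data Subset file format is not documented here and may look quite different.

```python
import json
from dataclasses import dataclass, field

# Hypothetical rule file: connection parameters plus per-table extraction
# and masking rules. None of these keys come from the real product.
EXAMPLE_CONFIG = """
{
  "source": {"type": "db2", "subsystem": "DSN1", "schema": "PROD"},
  "tables": [
    {
      "table": "CUSTOMER",
      "columns": ["ID", "NAME", "SSN"],
      "row_filter": "REGION = 'EU'",
      "masks": {"NAME": "substitute", "SSN": "redact"}
    },
    {"table": "ORDERS", "columns": ["ID", "CUSTOMER_ID", "TOTAL"]}
  ]
}
"""

KNOWN_MASKS = {"redact", "substitute", "hash"}


@dataclass
class TableRule:
    table: str
    columns: list
    row_filter: str = ""
    masks: dict = field(default_factory=dict)


def load_rules(text):
    """Parse the rule file and reject masks on unknown columns or mask types."""
    raw = json.loads(text)
    rules = []
    for entry in raw["tables"]:
        rule = TableRule(
            table=entry["table"],
            columns=entry["columns"],
            row_filter=entry.get("row_filter", ""),
            masks=entry.get("masks", {}),
        )
        for column, mask in rule.masks.items():
            if column not in rule.columns:
                raise ValueError(f"{rule.table}: masked column {column!r} is not extracted")
            if mask not in KNOWN_MASKS:
                raise ValueError(f"{rule.table}: unknown mask {mask!r}")
        rules.append(rule)
    return raw["source"], rules


source, rules = load_rules(EXAMPLE_CONFIG)
```

Validating the file before any extraction job runs means a typo in a masking rule fails fast instead of silently leaving a sensitive column unmasked.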
Does Data Subset have APIs?
Data Subset may expose APIs for integration with other testing and development tools. These APIs could be used to automate the data subsetting process, trigger data refreshes, or integrate with CI/CD pipelines.
Business Value
What business problem does Data Subset solve?
Data Subset solves the business problem of providing realistic test data without compromising sensitive information or overwhelming test environments. By creating smaller, representative subsets of production data, organizations can improve the efficiency and effectiveness of their testing processes.
What would happen if an organization did NOT use this product?
Without Data Subset, an organization would likely rely on manual data extraction and masking, which is time-consuming and error-prone and can expose sensitive data. Alternatively, it might load full production datasets into test environments, which is inefficient and resource-intensive.
How does Data Subset provide business value?
Data Subset helps organizations accelerate testing cycles, reduce the risk of data breaches, and comply with data privacy regulations. By providing realistic and secure test data, it enables faster and more reliable software releases.
Security
How does Data Subset protect sensitive data?
Data Subset applies data masking to protect sensitive information in the extracted subsets. Techniques may include encryption, redaction, and substitution.
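Two of the techniques named above, redaction and substitution, can be sketched in a few lines of Python. This is an illustrative sketch only: the function names, the pool of replacement names, and the sample row are hypothetical, and nothing here reflects how Data Subset implements masking internally.

```python
import hashlib

# Hypothetical pool of replacement values for substitution.
FAKE_NAMES = ["Alex Morgan", "Sam Rivera", "Jordan Lee", "Casey Kim"]


def redact(value, keep_last=4, fill="*"):
    """Replace all but the last keep_last characters with a fill character."""
    text = str(value)
    if len(text) <= keep_last:
        return fill * len(text)
    hidden = len(text) - keep_last
    return fill * hidden + text[hidden:]


def substitute(value, pool=FAKE_NAMES):
    """Swap a value for a pool entry chosen deterministically from its hash.

    The same input always maps to the same replacement, so masked values
    still join consistently across tables.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return pool[int.from_bytes(digest[:4], "big") % len(pool)]


def mask_row(row, rules):
    """Return a copy of row with each rule's function applied to its column."""
    masked = dict(row)
    for column, mask_fn in rules.items():
        if column in masked:
            masked[column] = mask_fn(masked[column])
    return masked


row = {"id": 7, "name": "Jane Doe", "card": "4111111111111111"}
masked = mask_row(row, {"name": substitute, "card": redact})
```

Deterministic substitution is a deliberate choice in this sketch: random replacement would hide the data equally well but would break joins between tables that share the masked value.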
How does Data Subset integrate with security systems?
Data Subset often integrates with existing security systems to control access to the extracted data subsets. This may involve integration with LDAP directories or other authentication providers.
Does Data Subset provide audit logging?
Data Subset may provide audit logging capabilities to track data extraction and masking activities. This allows organizations to monitor compliance with data privacy regulations and identify potential security breaches.
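For readers wondering what such an audit trail might record, the following Python sketch emits one JSON line per extraction or masking event. The field names and the audit_record helper are assumptions chosen for illustration; they do not describe Data Subset's actual log format.

```python
import json
from datetime import datetime, timezone


def audit_record(user, action, table, row_count, when=None):
    """Build one JSON audit line for an extraction or masking event.

    Hypothetical record layout; a real product defines its own fields.
    """
    when = when or datetime.now(timezone.utc)
    return json.dumps(
        {
            "timestamp": when.isoformat(),
            "user": user,
            "action": action,  # e.g. "extract" or "mask"
            "table": table,
            "row_count": row_count,
        },
        sort_keys=True,
    )


line = audit_record("tester1", "extract", "CUSTOMER", 500)
```

One self-describing line per event keeps the log easy to append to and easy to filter when a compliance review asks who extracted which table and when.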
Operations
What level of technical expertise is required?
Implementing Data Subset requires technical expertise in data extraction, data masking, and database administration. Ongoing operational requirements include monitoring data extraction jobs, maintaining data masking policies, and ensuring the security of the extracted data subsets.
What are common implementation challenges?
Common implementation challenges include identifying sensitive data, defining appropriate data masking rules, and ensuring the consistency and accuracy of the extracted data subsets.
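The consistency challenge is often a referential one: a child row survives the extraction but the parent row it points to does not. A minimal Python sketch of such a check follows; the find_orphans helper and the table and column names are hypothetical and only illustrate the idea.

```python
def find_orphans(child_rows, parent_rows, foreign_key, parent_key="id"):
    """Return child rows whose foreign key has no matching parent row."""
    parent_ids = {row[parent_key] for row in parent_rows}
    return [row for row in child_rows if row[foreign_key] not in parent_ids]


# Hypothetical extracted subset with one broken reference.
subset_customers = [{"id": 1}, {"id": 2}]
subset_orders = [
    {"id": 10, "customer_id": 1},
    {"id": 11, "customer_id": 2},
    {"id": 12, "customer_id": 3},  # customer 3 was not extracted
]

orphans = find_orphans(subset_orders, subset_customers, "customer_id")
```

Running a check like this after every extraction, and failing the job when the result is non-empty, catches broken relationships before testers run into them.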
What administrative interfaces are available?
Data Subset may provide administrative interfaces, such as a command-line interface (CLI) or a web console, for managing data extraction jobs, defining data masking policies, and monitoring system performance.