Dark data

What is dark data?

Dark data is the unused, largely unstructured data that an organization keeps in storage, often for compliance purposes or possible future forensic investigations. Gartner defines dark data as "the information assets organizations collect, process, and store during regular business activities, but generally fail to use for other purposes." Examples include server log files, former employees' information, email records, and older versions of files currently in use.

Dark data statistics


  • In 2019, 60 percent of more than 1,300 global businesses said that more than half of their data was dark.
  • 32 percent of respondents cited lack of resources as a hurdle to retrieving dark data.


Why is dark data important?

  • Gain untapped insights

    Mine dormant business data for insights and patterns in internal processes and customer correspondence to drive continuous improvement. Organizations that ignore important information hidden within dark data risk losing ground to competitors.

  • Spot security vulnerabilities

    Data that is not stored securely is vulnerable to leaks and theft. Publicly accessible data and systems running outdated software components are easy targets for attackers. It is therefore important to know whether any business-critical information sits within your dark data stores.

  • Optimize data storage

    Old and stale files that are no longer in use should be pruned from your storage ecosystem. Storing stale files and other redundant data is costly. Use a redundant, obsolete, and trivial (ROT) data calculator to estimate how much you can save.
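As a rough illustration of what such a ROT estimate involves, the Python sketch below totals the size of files that have not been modified within a given number of days and converts that figure into a monthly storage cost. It is a minimal sketch, not how any particular calculator works: the function names, the use of last-modified time as the staleness signal, and the cost-per-gigabyte figure are all assumptions made for illustration.

```python
import time
from pathlib import Path


def stale_file_bytes(root, max_age_days, now=None):
    """Total size in bytes of files under root not modified in max_age_days.

    Assumption: "stale" means the last-modified time is older than the cutoff.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file():
            info = path.stat()
            if info.st_mtime < cutoff:
                total += info.st_size
    return total


def estimated_monthly_saving(stale_bytes, cost_per_gb_month):
    """Monthly storage cost recovered if the stale bytes were removed.

    cost_per_gb_month is a hypothetical figure supplied by the caller.
    """
    return stale_bytes / 1024**3 * cost_per_gb_month
```

For example, `estimated_monthly_saving(stale_file_bytes("/srv/share", 365), 0.05)` would estimate the monthly saving for a hypothetical share at a hypothetical rate of 0.05 per gigabyte per month.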

What is dark data management?

Dark data management is the overall governance of underutilized data in an organization. It centers on a framework for how that data is collected, processed, used, and disposed of. With proper management of dark data, organizations can improve their customer experience, understand market trends, and strengthen their business operations to increase profits.

Types of dark data

Based on how it is used and where it is sourced from, dark data is categorized as:

  • Redundant data: Data that has been copied multiple times across files, intentionally or unintentionally.
  • Unused data: Data that is collected during business interactions but is left unprocessed in file repositories.
  • Spatial data: Data collected by IoT devices, such as sensors and processors that interact with other electronic devices.
  • Unstructured data: Raw data that has not been categorized by any criteria and so can’t readily be used for decision-making.
  • Metadata: Secondary information that describes or adds context to the primary data, such as the format, the source, and the time at which the data was collected.
  • Syndicated data: Datasets obtained from third-party firms rather than captured through the organization's own business operations.
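
Of these categories, redundant data is the most straightforward to detect automatically. The Python sketch below shows one common approach, assumed here purely for illustration: hash each file's contents and group files that share a digest. The function name is hypothetical, and the sketch finds only byte-for-byte copies, not near-duplicates or renamed edits.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_duplicate_files(root):
    """Group files under root whose contents are byte-for-byte identical."""
    by_digest = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256()
            with path.open("rb") as handle:
                # Read in 1 MiB chunks so large files are not loaded whole.
                for chunk in iter(lambda: handle.read(1 << 20), b""):
                    digest.update(chunk)
            by_digest[digest.hexdigest()].append(path)
    # Keep only digests shared by more than one file.
    return [paths for paths in by_digest.values() if len(paths) > 1]
```

Each returned group lists the paths of one set of identical files; all but one copy in each group are candidates for review.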

Security concerns over dark data

Dark data can be vast, and organizations often leave it unsecured. Businesses that don't monitor their dark data face the following risks:

  • Exposure or breach of sensitive data such as personally identifiable information (PII), confidential business data, and customer payment information left unidentified in your repositories.
  • Unsecured access to internal data such as log files, passwords, and previous employees' information, which can be leveraged by malicious insiders and external hackers to carry out data theft or breaches.
  • Backdoors into the organization's network, which attackers can create by exploiting older file versions or software versions.
  • Ransomware and other malware infections that go undetected because unstructured data is not monitored for sudden spikes in file activity.
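
The last risk, an unnoticed spike in file activity, can be made concrete with a small sliding-window check. The Python sketch below assumes file events are available as a list of timestamps in seconds and flags the moments at which more events than a chosen limit occurred within the preceding window. The function name, the input format, and the thresholds are illustrative assumptions, not a description of any specific monitoring product.

```python
from collections import deque


def find_activity_spikes(event_times, window_seconds, max_events):
    """Return timestamps at which more than max_events file events
    occurred within the preceding window_seconds (window_seconds > 0)."""
    window = deque()
    spikes = []
    for t in sorted(event_times):
        window.append(t)
        # Drop events that have fallen out of the window ending at t.
        while window[0] <= t - window_seconds:
            window.popleft()
        if len(window) > max_events:
            spikes.append(t)
    return spikes
```

With a 10-second window and a limit of 3 events, a burst such as five modifications within five seconds would be flagged, while the same five events spread over several minutes would not.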

Dark data analysis with DataSecurity Plus

DataSecurity Plus offers a multifaceted approach to dealing with dark data. With DataSecurity Plus, you can locate, audit, and secure sensitive files and folders along with other business files. Equip your IT infrastructure with the right tools to get the most out of the dark data in your file servers.

With data visibility and security functions, you can:

Classify and secure sensitive data

Spot and classify critical information such as PII, electronic protected health information (ePHI), customer payment information, and more using the file tagging capability in the data discovery tool. Keep up with external regulatory mandates using periodic reports on sensitive data storage, and stay on top of usage with data risk assessment software.
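
To give a sense of what pattern-based classification means in general, the Python sketch below tags a piece of text with labels for the kinds of sensitive data it appears to contain. It is a generic illustration and not DataSecurity Plus's method: the two regular expressions, the label names, and the function name are simplified assumptions, and real discovery rules are considerably stricter.

```python
import re

# Simplified, illustrative patterns (assumptions, not production rules).
PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def tag_sensitive_text(text):
    """Return the sorted labels of every pattern found in text."""
    return sorted(
        label for label, pattern in PATTERNS.items() if pattern.search(text)
    )
```

A file whose contents yield one or more labels could then be tagged for closer review, while files that yield an empty list are left untagged.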

Manage junk data

Maintain optimum space in your repositories with a storage analyzer, which can pinpoint all ROT data. Periodically review old and stale files to prevent them from being misused by malicious insiders or hackers.

Examine file permissions

Use the security permissions analyzer to view reports on permission violations, such as files with open access or inconsistent permissions. Get real-time notifications of security permission changes from the NTFS permissions auditing tool to thwart potential file exposure and leaks.

Audit file and folder activity

Use the file activity monitoring tool to track all file events and see which files are used and when. Get instant notifications of mass file deletions or security permission changes that may indicate a ransomware attack or another compromise of your systems.

Try all of these features and more with a free, fully functional 30-day trial.
