Most organizations have multiple types of data to store. Data size, speed of access and application priority dictate what type of storage each data set needs. As a result, many organizations use several different storage types in the data center rather than one homogeneous storage type.
Two major forms of data center storage are the storage area network (SAN) and network-attached storage (NAS). A SAN uses a network hardware fabric and switches to connect servers to storage. SANs are good for block I/O and structured data, such as relational databases. SANs typically run over either Fibre Channel networking or Ethernet, the latter using a protocol such as iSCSI. NAS, on the other hand, serves files over a file-level protocol, such as NFS or SMB, and is well suited to remote file serving. A NAS device operates as a dedicated file server and provides centralized data management. It’s best for unstructured data.
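The difference between block and file access can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a regular file stands in for a SAN LUN, a local directory stands in for a mounted NAS share, and the 512-byte block size and the helper names read_block and read_file are hypothetical.

```python
import os
import tempfile

BLOCK_SIZE = 512  # bytes; an assumed block size for this sketch


def read_block(device_path: str, block_number: int) -> bytes:
    """Block-style access: address data by block number, not by file name."""
    with open(device_path, "rb") as dev:
        dev.seek(block_number * BLOCK_SIZE)
        return dev.read(BLOCK_SIZE)


def read_file(share_root: str, relative_path: str) -> bytes:
    """File-style access: address data by its path on a shared file system."""
    with open(os.path.join(share_root, relative_path), "rb") as f:
        return f.read()


with tempfile.TemporaryDirectory() as tmp:
    # A regular file stands in for a SAN LUN (assumption for illustration).
    lun = os.path.join(tmp, "lun0.img")
    with open(lun, "wb") as dev:
        dev.write(b"A" * BLOCK_SIZE + b"B" * BLOCK_SIZE)

    # A local directory stands in for a mounted NAS share (assumption).
    share = os.path.join(tmp, "share")
    os.makedirs(share)
    with open(os.path.join(share, "report.txt"), "wb") as f:
        f.write(b"quarterly report")

    block = read_block(lun, 1)            # second block on the "LUN"
    doc = read_file(share, "report.txt")  # a named file on the "share"

print(len(block), doc)
```

In the block case, the server decides how blocks map to files or database pages; in the file case, the storage device owns the file system and the client only names a path.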
Hybrid storage arrays
Hybrid storage arrays join various types of storage together, mixing flash, hard disk drives (HDDs), tape, and object- and cloud-based storage into a single storage infrastructure. Data types, like live data, file server data, streaming data and virtual systems, often have different storage requirements. A hybrid storage array can bring the speed and low latency of flash to the table but still offer the flexibility and lower costs of HDDs, tape and cloud. However, hybrid storage is more complex than an all-flash or all-hard drive system.
A hybrid storage array usually requires tiering software. This software sorts data into tiers within the storage system based on factors such as activity, throughput requirements and redundancy, and so determines where specific data lives within the system.
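A tiering decision of this kind might look like the following minimal sketch. The tier names, thresholds and data set names are hypothetical, and the rules consider only activity and latency needs; real tiering software also weighs factors such as redundancy and usually moves data automatically over time.

```python
from dataclasses import dataclass

# Hypothetical tier names and thresholds, chosen only for illustration.
FLASH, HDD, ARCHIVE = "flash", "hdd", "tape-or-cloud"


@dataclass
class DataSet:
    name: str
    reads_per_day: int       # a simple proxy for activity
    needs_low_latency: bool  # a simple proxy for throughput requirements


def choose_tier(ds: DataSet) -> str:
    """Pick a tier from activity and latency needs (illustrative rules)."""
    if ds.needs_low_latency or ds.reads_per_day >= 1000:
        return FLASH
    if ds.reads_per_day >= 10:
        return HDD
    return ARCHIVE


datasets = [
    DataSet("orders-db", 50_000, True),
    DataSet("file-shares", 200, False),
    DataSet("2015-backups", 0, False),
]

# Map each data set to the tier the rules select for it.
placement = {ds.name: choose_tier(ds) for ds in datasets}
print(placement)
```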
Storage virtualization can help organizations host more than one storage array type and better predict storage costs. Many vendors offer software management tools that can help with virtualizing storage, including Flexify.IO, Nutanix AOS, StarWind Virtual SAN and DataCore SANsymphony.
Different storage virtualization tools can virtualize hardware storage arrays, create virtualized storage over hyper-converged infrastructure or specialize in cloud-native storage. In addition, tools differ in the storage types they support; certain tools only work with block storage, and others work at the file level. When choosing a tool, consider the storage types it supports, its availability features and how it fits with existing management tools.
When considering different virtual disks, particularly in VMware environments, choose among raw, thin and thick disks. A raw disk maps a storage logical unit number (LUN) on a SAN directly to a VM. Only a small disk descriptor file sits in the VM’s working directory; the VM’s disk data lives on the LUN itself, which can improve I/O performance for some applications. A thick disk, meanwhile, can boost performance and security by using thick provisioning to pre-allocate physical storage. Finally, a thin disk optimizes disk efficiency by consuming only the amount of disk space it currently requires.
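The trade-off between thick and thin provisioning can be shown with a small model. This is a sketch, not VMware's implementation: the VirtualDisk class and its fields are hypothetical and track only how much physical space each disk type consumes on its backing datastore.

```python
class VirtualDisk:
    """Toy model of a virtual disk's space consumption (illustrative only)."""

    def __init__(self, capacity_gb: int, thin: bool):
        self.capacity_gb = capacity_gb  # size the VM sees
        self.thin = thin
        self.written_gb = 0             # data the VM has actually written

    def write(self, gb: int) -> None:
        if self.written_gb + gb > self.capacity_gb:
            raise ValueError("write exceeds provisioned capacity")
        self.written_gb += gb

    @property
    def physical_gb(self) -> int:
        """Space consumed on the backing datastore."""
        # Thick: everything is pre-allocated. Thin: only what has been written.
        return self.written_gb if self.thin else self.capacity_gb


thick = VirtualDisk(100, thin=False)
thin = VirtualDisk(100, thin=True)
for disk in (thick, thin):
    disk.write(20)

print(thick.physical_gb, thin.physical_gb)  # prints: 100 20
```

Both disks present 100 GB to the VM, but after 20 GB of writes the thick disk already holds all 100 GB of physical storage while the thin disk holds only 20 GB.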
Data lake vs. data warehouse, cloud vs. on premises
A data lake is a large repository for holding raw data in its native format. Unlike a traditional data warehouse, which organizes data hierarchically, a data lake stores data as files or objects in a flat architecture.
Both data lakes and data warehouses require a great deal of storage, especially for a large organization. Many storage vendors offer specialized products for each storage architecture. For example, Dell EMC’s Elastic Data Platform and Hitachi Vantara suit large-scale, on-premises data lake deployments. Vendors like IBM, NetApp and HPE have offerings that can work with either architecture style. Meanwhile, cloud vendors such as AWS, Microsoft and Google Cloud Platform offer storage as a service in either architecture.
Storage for containers
Containers are agile but short-lived by design, so application developers working with them need persistent storage for any data that must outlive the containers they deploy.
Container architectures require three types of storage: image storage, a data store for container management and container application storage. Data centers can store images and container management data with existing shared storage architectures. Container application storage, however, requires a specific system data volume (a persistent volume) in the container’s namespace to give the container direct access to read or write into a host system directory or file share.
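As one possible illustration, the sketch below builds two manifests that request a persistent volume and mount it into a container. It assumes Kubernetes as the orchestrator, which the text above does not specify; the resource names, image name, mount path and 1Gi size are all hypothetical placeholders.

```python
import json

# Assumption: Kubernetes is the orchestrator. All names below are placeholders.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "app-data"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "resources": {"requests": {"storage": "1Gi"}},
    },
}

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "app"},
    "spec": {
        "containers": [
            {
                "name": "app",
                "image": "example/app:latest",  # hypothetical image
                # The volume appears inside the container at this path.
                "volumeMounts": [{"name": "data", "mountPath": "/var/lib/app"}],
            }
        ],
        # The pod refers to the claim by name; data written under the mount
        # path is stored on the claimed volume, not in the container itself.
        "volumes": [
            {"name": "data", "persistentVolumeClaim": {"claimName": "app-data"}}
        ],
    },
}

# Serialize both manifests as JSON documents.
manifests = [json.dumps(m, indent=2) for m in (pvc, pod)]
print("\n".join(manifests))
```

Because the claim exists separately from the pod, the container can be replaced or rescheduled without losing the data it wrote to the mounted path.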