File Sharing in eCommerce – Past, Present, and Future
To operate cost-effectively at scale, an eCommerce platform needs multiple application nodes. With multiple nodes, however, there must be a mechanism for sharing certain assets, such as product images, between them.
In this post, we’ll take a look at past file-sharing approaches, what is most effective today, and the technology that will define the future of file sharing. If, while reading this, you realize that your organization is using outdated systems and methods, don’t worry: we’ll cover what to do at the end of the post.
Past: Network File System (NFS)
Historically, Sun Microsystems’ Network File System (NFS) has been the go-to system for sharing assets between eCommerce application nodes. First developed in the mid-1980s, NFS is a mature, robust solution for sharing files over a network. As of version 4, NFS builds file locking into the protocol itself (earlier versions relied on a separate lock manager) and adds other critical features needed by most eCommerce applications.
In a standard NFS architecture, one system acts as the server, physically storing the shared files on its local disks. Any number of other systems then act as clients and request access to the files over the network. Those files are typically not stored on the client systems, however; changes to files are sent across the network back to the NFS server.
While NFS scales well to handle many clients, its primary drawback is that it relies on a single server to serve all requests. If that server is unavailable due to capacity issues or required maintenance, then all shared assets are unavailable.
This can cause major issues for an eCommerce site. Product images won’t display, and shared sessions and caches will be unavailable. The site may still be online, but customers won’t be able to use it and the experience will certainly suffer.
There are solutions to NFS’s single point of failure (SPOF), such as NFS-Ganesha, DRBD, and virtual IPs. Unfortunately, these solutions are secondary systems bolted on to NFS, adding layers of complexity that make your eCommerce platform harder to manage and less reliable.
Present: GlusterFS
Instead of the older Network File System, we now use GlusterFS (Gluster, for short) to share files between application nodes. GlusterFS is an open-source, distributed, networked file system backed by Red Hat, Inc. It enjoys active development and is built from the ground up to enable a high degree of data availability and scalability.
We run Gluster with a replicated volume: each peer in the Gluster cluster (say that three times fast!) holds a complete copy of all the data in the volume. Any number of servers, from two to more than two hundred, can be peers in the cluster. When a file is added, changed, or deleted on one peer, that modification is rapidly replicated to every other peer, keeping the data consistent. Cluster peers can also be clients, both storing data and accessing it for application use.
No More SPOFs!
Because Gluster replicates data to every peer in the cluster, there is no single point of failure. If a peer goes down, the remaining members of the cluster will continue synchronizing files and delivering data to clients. Once the peer returns to the cluster, it will begin a self-healing process to synchronize with the rest of the cluster. Clients are configured to contact the cluster through multiple points and, after the initial connection, can communicate with every peer in the cluster.
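To make the failover and self-healing behavior concrete, here is a minimal sketch in Python. It is a toy in-memory model, not GlusterFS's actual API or healing algorithm: the class name ReplicatedVolume, its methods, and the peer names are all hypothetical, and the "heal" step simply copies the full state from a healthy peer.

```python
class ReplicatedVolume:
    """Toy model of a fully replicated volume (hypothetical, not GlusterFS code)."""

    def __init__(self, peer_names):
        # Every peer holds its own complete copy of the volume's files.
        self.copies = {name: {} for name in peer_names}
        self.online = set(peer_names)

    def write(self, path, data):
        # A change is applied to every peer that is currently reachable.
        if not self.online:
            raise RuntimeError("no peers online")
        for name in self.online:
            self.copies[name][path] = data

    def read(self, path):
        # Any online peer can serve the read, so one failure is not fatal.
        if not self.online:
            raise RuntimeError("no peers online")
        first_peer = sorted(self.online)[0]
        return self.copies[first_peer][path]

    def fail(self, name):
        # Simulate a peer dropping out of the cluster.
        self.online.discard(name)

    def rejoin(self, name):
        # Simplified self-heal: resynchronize from a healthy peer on return.
        if self.online:
            source = sorted(self.online)[0]
            self.copies[name] = dict(self.copies[source])
        self.online.add(name)


volume = ReplicatedVolume(["web1", "web2", "web3"])
volume.write("media/shoe.jpg", b"v1")
volume.fail("web2")                      # web2 goes down
volume.write("media/shoe.jpg", b"v2")    # the cluster keeps accepting changes
volume.rejoin("web2")                    # web2 returns and catches up
```

After the last line, web2 holds the same files as web1 and web3 again. A real cluster heals file by file in the background rather than copying everything at once.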
Future: SquashFS and UnionFS
It’s always fun to look over the horizon at the systems and optimizations on the cutting edge of eCommerce hosting and file sharing. SquashFS is one of the most interesting. It is a compressed, read-only filesystem that speeds up sharing and enhances security. SquashFS can compress whole filesystems or single directories into an image that can be copied to other nodes and mounted directly from memory. Reads served from memory can be faster than reads from SSDs.
UnionFS is another system, one that combines the strengths of SquashFS and GlusterFS. UnionFS allows the files and directories of separate filesystems (“branches”) to be transparently overlaid, forming a single coherent filesystem. The different branches may be either read-only or read-write.
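As a rough illustration of that workflow, the Python sketch below assembles the two commands involved: mksquashfs to build a compressed image and mount to attach it read-only through a loop device. The function name and all paths are hypothetical, the xz compressor is an assumption (it depends on how squashfs-tools was built), and keeping the image on a tmpfs path such as /dev/shm is only one possible way to hold it in memory.

```python
import subprocess


def squashfs_commands(source_dir, image_path, mount_point):
    """Return the build and mount commands for a SquashFS image (illustrative)."""
    # Compress the directory tree into a single read-only image file.
    build = ["mksquashfs", source_dir, image_path, "-comp", "xz"]
    # Loop-mount the image read-only at the chosen mount point.
    mount = ["mount", "-t", "squashfs", "-o", "loop,ro", image_path, mount_point]
    return build, mount


def run_commands(commands):
    # Executes each command in order; mounting normally requires root.
    for command in commands:
        subprocess.run(command, check=True)


# Hypothetical paths: the image lives on tmpfs so reads come from memory.
build, mount = squashfs_commands("/srv/app", "/dev/shm/app.sqsh", "/mnt/app")
```

Calling run_commands([build, mount]) on a node with squashfs-tools installed would build and mount the image. The sketch deliberately stops short of running anything itself.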
For example, a merchant can use UnionFS to lay a read-write media server utilizing GlusterFS over a read-only application server running SquashFS. This keeps the application secure while content on the media server can easily be modified, all within a single, coherent filesystem.
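The overlay idea can be sketched in a few lines of Python. This is a toy model of union-mount lookup rules, not UnionFS itself: the UnionMount class and the file paths are hypothetical, each branch is just a dictionary, and deletions (which real union filesystems handle with "whiteout" entries) are left out.

```python
from types import MappingProxyType


class UnionMount:
    """Toy model of a read-write branch overlaid on a read-only branch."""

    def __init__(self, read_only_branch, read_write_branch):
        self.lower = read_only_branch    # e.g. the application image
        self.upper = read_write_branch   # e.g. the shared media volume

    def read(self, path):
        # The writable branch takes priority; otherwise fall through.
        if path in self.upper:
            return self.upper[path]
        return self.lower[path]

    def write(self, path, data):
        # All writes land in the writable branch; the lower one is untouched.
        self.upper[path] = data

    def listing(self):
        # Callers see one merged namespace across both branches.
        return sorted(set(self.lower) | set(self.upper))


# MappingProxyType gives a read-only view, standing in for the SquashFS image.
application = MappingProxyType({"app/index.php": b"<?php /* storefront */"})
media = {}

site = UnionMount(application, media)
site.write("media/shoe.jpg", b"image bytes")
```

Here site.listing() returns both the application file and the uploaded image, while the application branch itself cannot be modified through the overlay.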
Moving Your Infrastructure Forward
Reading this, you may realize that your filesystem is not as advanced as it could be. These systems are more important than most think: file transfer speed and security form the backbone of an engaging, successful digital experience.
If you have any questions or concerns, feel free to give us a call. From infrastructure reviews to hosting client applications in our state-of-the-art data center in Chicago, the experts on the LYONSCG Application Hosting team are here to help.