Suning Cloud Commerce is one of the largest privately owned retailers in China. Suning has more than 1600 stores covering over 700 cities in Mainland China, Hong Kong and Japan, and its e-commerce platform, Suning.com, ranks among the top three Chinese B2C companies. There are more than 180,000 employees, thousands of servers (a mix of Power, x86 and storage servers) and tens of thousands of virtual machines in several large data centers across China, Hong Kong and Japan. KVM, oVirt and other virtualization technologies are widely used, and there are also very large server farms for VDI.
Suning started its OpenStack journey in June 2013, and in 2014 an OpenStack-based production environment was put into use to serve Android mobile applications and services, the search engine and retail-related business needs, plus a critical code repository server. More than half of the virtual machines are resource-intensive and heavily loaded. Cinder multi-backend provides extra storage by leveraging LVM and Gluster. A light VDI solution serves desktop users by leveraging open source SPICE. Within this deployment, network separation and Open vSwitch bonding are supported.
This success story led to a large-scale OpenStack deployment, and Heat's auto-scaling fits the scenarios of Suning's popular Internet applications well. Here we use Heat to assist two kinds of applications:
- Elastic web computation. The web cluster we use is IHS + WAS. It is one of the major web applications, serving tens of millions of visitors, and the number of visits changes sharply. We need to adjust the cluster scale based on visits and on CPU load;
- Real-time image thumbnail processing. This requires generating image and document previews, plus real-time video transcoding. A distributed scheduling framework is used, and beyond this we also leverage OpenStack Heat to dynamically create or destroy workers based on requests. For example, with concurrent image thumbnail requests, fewer than 10 workers are enough to serve the requests at the quietest times, but during peak time thousands of workers may need to be created. This demands that the clusters expand or shrink fully automatically and with fast response. Once the request volume drops, the resources should be released immediately so that other kinds of jobs can reuse them.
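The sizing decision behind this kind of request-driven scaling can be sketched in a few lines of Python. This is only an illustrative sketch, not Suning's implementation or Heat's API: the function names, the per-worker throughput figure and the minimum and maximum bounds are all hypothetical. It computes how many workers a given request rate needs, clamps that to the allowed range, and returns a signed adjustment of the kind a scaling policy would apply.

```python
import math


def desired_workers(request_rate, per_worker_rate, min_workers=1, max_workers=2000):
    """Return the worker count needed for request_rate, clamped to the bounds.

    All parameters are hypothetical: request_rate and per_worker_rate are
    requests per second, and the bounds stand in for the group's size limits.
    """
    if per_worker_rate <= 0:
        raise ValueError("per_worker_rate must be positive")
    needed = math.ceil(request_rate / per_worker_rate)
    return max(min_workers, min(max_workers, needed))


def scaling_adjustment(current_workers, request_rate, per_worker_rate, **bounds):
    """Signed change in worker count: positive scales out, negative scales in."""
    target = desired_workers(request_rate, per_worker_rate, **bounds)
    return target - current_workers


if __name__ == "__main__":
    # Quiet period: 100 workers running, load needs far fewer, so scale in.
    print(scaling_adjustment(100, 500, 50))
    # Peak period: load exceeds what the upper bound can serve, so cap at max.
    print(desired_workers(1_000_000, 50))
```

Returning a signed delta rather than an absolute size keeps scale-out and scale-in symmetric, so that capacity freed when requests drop can be handed back for other jobs as soon as the next evaluation runs.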
We will share our story, experience and practice in enabling and supporting this kind of large cloud, along with best practices and lessons learned.