Scaling Distributed Machine Learning leveraging VMware Bitfusion on Kubernetes with NVIDIA GPUs

VMware Bitfusion extends the power of VMware vSphere’s virtualization technology to GPUs. It helps enterprises disaggregate GPU compute and dynamically attach GPUs anywhere in the datacenter, much like attaching storage. Bitfusion allows arbitrary fractions of a GPU to be allocated, so more users can be supported in the test and development phase. Distributed machine learning across multiple nodes can be used effectively for training. This video demonstrates the effectiveness of sharing GPUs across jobs with minimal loss of performance. VMware Bitfusion makes distributed training scalable across physical resources and effectively unconstrained in terms of GPU resources. The solution showcases the benefits of combining best in class…

