GetHPCOS

Do you need an HPC-optimized OS?
Is HPC software the missing piece? Join the discussion below!
Cray Inc.
What does the underlying OS really mean to you?
Sunny
To Cray, the underlying OS means the entire operating environment: the kernel, OS services, daemons, and other software that provides user services. It also includes integrated interfaces to third-party components like workload managers.
Joseph B George (JBG)
Right - it's direct access to system resources. Being able to work with the OS, and sometimes enhance it (user mode or otherwise), lets our applications deliver better performance
Joseph B George (JBG)
And that's why we're all doing this - better performance! :)
Sunny
I see Scott from Altair in the crowd. Do you have a comment, Scott?
Martijn de Vries
once containerized workloads become more mainstream in HPC, the actual OS running on your nodes will become less relevant, since everything your jobs need to run will be in the container image.
Sunny
Yes, we see workload requirements becoming more diverse and containers being an important part of supporting them.
Sunny
Container use is increasing for HPC workloads.
Scott Suchyta (HPC)
Agree with Martijn. The workload needs to be orchestrated such that jobs will run on the right nodes at the right time -- job schedulers will be critical in the stack
Joseph B George (JBG)
Yes, the applications are evolving quickly - our communities are starting to think through workload management and container orchestration
Tom Joy
@Scott_HPC do we have a scheduler that supports containers? I am not sure how it identifies the containers on a node...
Scott Suchyta (HPC)
@Tom, #PBSPro supports containers. The container is part of the request for a job, and PBS will deploy the predefined container image to the node(s).
Joseph B George (JBG)
+1 to @Scott_HPC - great that Altair is working this
Sunny
Cray is working with Altair and others on the coming mash-up of container, WLM, and orchestration technologies that serve very broad workload requirements. We might throw provisioning in there too.
Tom Joy
@Scott_HPC does that mean containers have to be mentioned in resourcedef?
Scott Suchyta (HPC)
The #PBSPro 18.x release simplifies container integration for sites. From the user's POV, requesting a container means setting an environment variable: qsub ... -v CONTAINER_IMAGE=name_of_container
Scott Suchyta (HPC)
From the admin's POV, you can create custom resources to target the specific nodes that are eligible to execute the requested container
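A minimal sketch of the user-side request described above, written in Python so the command line can be built and inspected without a PBS installation. Only the `-v CONTAINER_IMAGE=...` form comes from the discussion; the helper name `build_qsub_command` and the job script name `run_job.sh` are hypothetical, and how a site maps the variable to an actual image depends on its configuration.

```python
import shlex


def build_qsub_command(job_script, container_image):
    """Return the argv list for a qsub call that requests a container image."""
    # qsub's -v option takes a comma-separated list of variables,
    # so a comma inside the image name would be read as a separator.
    if not container_image or "," in container_image:
        raise ValueError("container image name must be non-empty and contain no commas")
    return ["qsub", "-v", f"CONTAINER_IMAGE={container_image}", job_script]


# Hypothetical job script name; the image name is the placeholder from the chat.
cmd = build_qsub_command("run_job.sh", "name_of_container")
print(shlex.join(cmd))
```

Building the argument list first, rather than a shell string, keeps quoting out of the picture; the list can be passed to `subprocess.run` unchanged on a system where `qsub` is available.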
Cray Inc.
Is there value in packaging things together?
Joseph B George (JBG)
There is great value in packaging and optimizing the whole environment, which is what Cray uniquely does. The user gets a complete integrated distribution and does not have to build the environment from component parts.
Martijn de Vries
it's valuable to be able to deploy a tuned and flexible setup so that the wheel does not have to be reinvented every time a system gets deployed
Piush Patel
it will result in a more stable environment and makes things easier to support on mission critical compute infrastructure
Joseph B George (JBG)
Agree - we find that, generally, administrators spend a lot of time focusing on maintenance and keeping the machines running - the more we can keep things flexible, the more customers can focus on innovation and solving key challenges
Scott Suchyta (HPC)
@jbgeorge agree! admins spend a lot of time making sure all of the moving parts are working together. It really sucks when a component changes and breaks three other components that were depending on it.
Joseph B George (JBG)
Agree @Scott_HPC - it might be ok to do that when putting together your presents on Christmas morning (I've been there!), but you never want to do that with your HPC system!
Yevgeniya Perederey
@jbgeorge What are the requirements to minimize OS noise and improve system performance? #gethpcos
Joseph B George (JBG)
A common question! A number of system functions run through the operating system, so it can be an area of overhead, but also a place to drive efficiency - some are things you can do in the system as a whole, others are things you need to do in the kernel
Joseph B George (JBG)
One simple way to minimize OS noise is to examine the different types of nodes that exist in your system - some are job-running compute nodes, some are service nodes - not all nodes require the same level of OS enablement!
Joseph B George (JBG)
For example, one question we asked at Cray was "does a compute node need EVERYTHING in Linux to perform optimally?" The answer was no - so we've managed to drive efficiency into the compute node Linux, keeping more resources free for better application performance
Sunny
At the user level, users can start with the normal things they do to improve process and MPI rank synchronization at the application level. That helps.
Sunny
But after that, you need an operating environment that is composed to do this.
Alison Paisley
@jbgeorge Different but related...what does allocating jobs to specific nodes do?
Joseph B George (JBG)
Great question - if you think about how an HPC cluster is architected, you can have various node types throughout, differing in processor type, server model, etc
Joseph B George (JBG)
Some nodes may have a better memory profile - better suited for memory intensive applications. Some nodes may have newer processor types and some applications can drive better performance. Being able to specify the nodes the job runs on means a better app run - which is huge.
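The idea of steering a job to the nodes that suit it can be sketched in a few lines of Python. This is an illustration only, not how any particular scheduler works: the `Node` fields, the node names, and the CPU labels are all made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    mem_gb: int
    cpu_model: str


def eligible_nodes(nodes, min_mem_gb=0, cpu_model=None):
    """Return the names of nodes that meet a job's memory and processor needs."""
    return [
        n.name
        for n in nodes
        if n.mem_gb >= min_mem_gb and (cpu_model is None or n.cpu_model == cpu_model)
    ]


# Hypothetical inventory of a mixed cluster.
cluster = [
    Node("n001", 128, "older_cpu"),
    Node("n002", 512, "older_cpu"),
    Node("n003", 256, "newer_cpu"),
]

# A memory-intensive job and a job that benefits from the newer processor.
print(eligible_nodes(cluster, min_mem_gb=256))
print(eligible_nodes(cluster, cpu_model="newer_cpu"))
```

In practice the workload manager does this matching from the resources a job requests; the sketch only shows why the request matters.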
Piush Patel
How does the collaboration between Cray and your partners' engineering teams (on an optimized HPC OS like CLE) benefit customers?
Sunny
Our engineering teams collaborate to create a seamless integration between CLE and SW like PBS Pro.
Sunny
This collaboration produces better scalability, performance, quality, and reliability.
Joseph B George (JBG)
IMHO the ecosystem is critical to see progress - there are a variety of use cases, various permutations of applications + mgmt. software + processor types + locations (cloud, on prem, etc) - collaboration between partners is critical - and the customer benefits the most
Joseph B George (JBG)
Your perspective, Piush?
Scott Suchyta (HPC)
Strong collaboration also results in shorter time to market. Partners don't have to wait for @cray_inc to deliver a feature before starting work on the integration. Very important for customers wanting bleeding-edge solutions
Piush Patel
agree with scott and sunny!
Paul Rosien
What can be done to reduce system jitter? #GetHPCOS
Joseph B George (JBG)
Great question - and for those who are not familiar with the term, jitter can be seen as latency in the system. Addressing jitter can result in far better overall performance of the jobs. Re: what can be done...
Joseph B George (JBG)
There are some things you can do, like ensuring you're using a high-speed interconnect vs something more standard. For others, however, you need to jump into the kernel and modify it to reduce jitter
Joseph B George (JBG)
At @cray_inc, we have found that tweaking internal process synchronization and memory utilization, as well as tightening execution paths in the OS, has helped a great deal here
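Why jitter hurts more as jobs grow can be shown with a simplified model, offered here as an assumption rather than anything measured on a Cray system: if each rank is independently delayed by OS noise with probability p during a step, and the step ends only when the slowest rank arrives, then the chance that a step is delayed is 1 - (1 - p)^N for N ranks. The function name and the value of p below are illustrative.

```python
def prob_step_delayed(p_noise, n_ranks):
    """Probability that at least one of n_ranks is hit by noise in a synchronized step.

    Assumes each rank is delayed independently with probability p_noise.
    """
    return 1.0 - (1.0 - p_noise) ** n_ranks


# Illustrative per-rank noise probability of 0.1% per step.
for n in (1, 100, 10_000):
    print(n, prob_step_delayed(0.001, n))
```

Under this model a disturbance that is rare on one node becomes close to certain across thousands of ranks, which is one reason reducing noise on compute nodes pays off at scale.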
Scott Suchyta (HPC)
To @HPC_sunny's comment about using the @cray_inc programming environment: what is Cray's view on using commercial and/or open source in the HPCOS? I assume the obvious response is the right tool for the job, but are there other criteria for selecting the tools?
Sunny
:-) Since we build on Linux, we already do. Agree with your comment. Beyond the distros....
Sunny
we do incorporate upstream community/open source code...and also contribute some of it back to the community
Joseph B George (JBG)
Criteria would include does the application adequately support the OS, what are the admins trained on, what are existing tools integrated with, etc
Cray Inc.
Can you provide suggestions for how to improve application scalability?
Sunny
Pay attention to your use of MPI collectives like barrier, reduce, allreduce, and bcast. Ensure you are utilizing libraries that are optimized for HPC. Profile your code using tools in Cray PE.
Sunny
The Cray profiler can analyze 100k MPI ranks.
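One way to see why collectives deserve attention at scale is to count communication rounds. The sketch below compares a root that sends to every rank in turn with a binomial-tree broadcast, where the number of ranks holding the data doubles each round. It is a counting exercise under those two assumed schemes, not a description of what any MPI library actually does; real implementations choose among several algorithms, and the function names here are made up.

```python
def linear_rounds(n_ranks):
    """Rounds needed if the root sends to each other rank one at a time."""
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    return n_ranks - 1


def tree_rounds(n_ranks):
    """Rounds needed for a binomial-tree broadcast among n_ranks."""
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 1, using integers only.
    return (n_ranks - 1).bit_length()


for n in (8, 1_024, 100_000):
    print(n, linear_rounds(n), tree_rounds(n))
```

The gap between the two columns widens quickly with rank count, so how often a code calls a collective, and which library implements it, matters more the larger the job.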
Martijn de Vries
depends on whether your application just needs to do number-crunching (in which case optimizing MPI is the answer). here's a thought if it's not just number-crunching: develop it like a cloud-native application using 12-factor design principles
Joseph B George (JBG)
And from an OS perspective, two things help immensely with scalability and performance: 1) deploying jobs to the best nodes for the workload (memory, processor, etc) and 2) using a lightweight OS on the compute nodes
Scott Suchyta (HPC)
@jbgeorge sounds like you need a job scheduler to figure out the when and where ;-)
Joseph B George (JBG)
And to Sunny's point, there is a lot you can do with a flexible programming environment
Cray Inc.
Is being able to build the images critically important for any particular reason?
Sunny
Having the tools to build and manage images allows you to more easily adapt/re-image portions or all of your system as needed: for example, in response to dynamic workload requirements, without having to reinstall software from scratch. Images can be created independently.
Joseph B George (JBG)
Yes! I like to think in terms of the application - what resources does the application need, how do we get the results faster, etc. Being able to build images helps us maximize this
Joseph B George (JBG)
Others in the audience - your thoughts?
Sunny
You can build images at any time. Create a library for later use.
Martijn de Vries
it's important to be able to switch between images quickly so that your nodes can be tailored for a particular type of job.