Archive: January 22, 2023


Hello CUDA: coding - GPU series #2

Sunday,  01/22/23  04:18 PM

Continuing the series about CUDA and GPU acceleration, let's talk about CUDA coding.  (Next post here)

How does it work?


To start, let's take a quick example.  Let's say we have two big arrays, A and B, and we want to create a third array C, such that for each element c = a² + b.  This is a vastly parallel problem, perfect for GPU acceleration.  Here's how this could be done in a vanilla C program:

The main() function is pretty simple; it allocates the three arrays using the standard C runtime function malloc(), initializes the first two arrays A and B, then calls a function called domath() to do the actual computations into array C.  And then it deallocates the arrays using the standard free() function.

The domath() function consists of a simple loop through all the array elements.  It calls a subfunction called mathalgo() which performs the actual computation. 
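A minimal sketch of such a vanilla C program might look like the following.  The function names domath() and mathalgo() come from the description above; everything else is an assumption made for illustration: the float element type, the initial values, and run_vanilla(), a hypothetical stand-in for main() that returns the last element so the result can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* The actual computation for one element: c = a^2 + b. */
float mathalgo(float a, float b)
{
    return a * a + b;
}

/* Serial loop through every element; a single CPU core does all the work. */
void domath(const float *A, const float *B, float *C, size_t n)
{
    for (size_t i = 0; i < n; i++)
        C[i] = mathalgo(A[i], B[i]);
}

/* Hypothetical stand-in for main(): allocate, initialize, compute, deallocate.
   Returns the last element of C, or -1 if an allocation fails. */
float run_vanilla(size_t n)
{
    assert(n > 0);

    float *A = malloc(n * sizeof *A);
    float *B = malloc(n * sizeof *B);
    float *C = malloc(n * sizeof *C);
    if (!A || !B || !C) {
        free(A); free(B); free(C);
        return -1.0f;
    }

    /* Assumed initial values (not from the original post): A[i] = i, B[i] = 1. */
    for (size_t i = 0; i < n; i++) {
        A[i] = (float)i;
        B[i] = 1.0f;
    }

    domath(A, B, C, n);
    float last = C[n - 1];

    free(A); free(B); free(C);
    return last;
}
```

With these assumed inputs, calling run_vanilla(4) from main() computes C = {1, 2, 5, 10} and returns 10.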

This vanilla C program is single-threaded; all the computations are executed on a CPU core in a serial fashion.  Let's see how this could be accelerated.  Here's the same logic implemented in a "chocolate" CUDA program:

There are relatively few changes and poof! we have a CUDA program.  At the top, we included two new headers which declare the CUDA API.  The main function now has a scope qualifier, __host__,  which designates the function to be Host code.  This is the default, but for this example we made it explicit.  The storage allocation calls to malloc() have been replaced by calls to cudaMallocHost().  They work the same way, allocating a region of storage, but the storage is accessible to the Device as well as the Host.  We'll talk about memory in more detail a bit later.

The domath() function is now called with two extra parameters preceding the usual ones, enclosed by triple angle brackets.  This is a CUDA extension for "invoking a kernel", that is to say, for calling a function from the Host which will run on the Device.  This call is followed by a new call to cudaDeviceSynchronize(), which halts the CPU thread until [all of the] GPU threads have completed.  And then the storage is deallocated with cudaFreeHost() instead of free().

Moving up to the domath() function, you'll note it has a scope qualifier of __global__.  Global functions run on the Device, but are callable from the Host.  Under the covers there is a driver call from the CPU to the GPU which passes the bytecode of the kernel and causes it to be executed.  The mathalgo() function has a scope qualifier of __device__, which means it is Device code and only callable from other Device code.

Let's go back and look at those new parameters on the domath() call.  They define the number of parallel processing threads to be created on the GPU.  We'll talk about those in more detail a bit later, but for now, just know that this creates 1,024 x 64 = 65,536 threads!

The loop in the domath() function has been changed a bit too.  If we didn't change it, we'd simply execute the same loop 65,536 times.  Instead we want to break up the processing into 65,536 pieces.  The CUDA built-in variables threadIdx and blockIdx identify each thread within its block and each block within the grid.  Using these values, we can give each thread a different starting point - the value of index - and then step through the array to bypass all the other threads - the value of stride.  So thread #100 processes array element 100, then 100 + 65,536, then 100 + 2 x 65,536, and so on.
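The index-and-stride arithmetic can be checked on the CPU before any GPU is involved.  Here is a rough sketch in plain C which simulates the threads one after another; it is not CUDA code.  The parameter names mirror CUDA's built-in variables (blockIdx.x, threadIdx.x, gridDim.x, blockDim.x), while thread_work() and covers_exactly_once() are hypothetical helpers, and the split into 1,024 blocks of 64 threads is an assumption (the post only gives the product).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Simulates the loop one GPU thread would run.  Marks every element this
   thread visits and returns how many elements it visited. */
size_t thread_work(unsigned blockIdx_x, unsigned threadIdx_x,
                   unsigned gridDim_x, unsigned blockDim_x,
                   unsigned char *touched, size_t n)
{
    /* Unique starting point for this thread. */
    size_t index = (size_t)blockIdx_x * blockDim_x + threadIdx_x;
    /* Total number of threads; stepping by this skips past all the others. */
    size_t stride = (size_t)gridDim_x * blockDim_x;

    size_t count = 0;
    for (size_t i = index; i < n; i += stride) {
        touched[i]++;
        count++;
    }
    return count;
}

/* Runs every simulated thread serially and returns 1 if each of the n
   elements was visited exactly once, 0 otherwise. */
int covers_exactly_once(unsigned gridDim_x, unsigned blockDim_x, size_t n)
{
    unsigned char *touched = calloc(n, 1);
    if (!touched)
        return 0;

    for (unsigned b = 0; b < gridDim_x; b++)
        for (unsigned t = 0; t < blockDim_x; t++)
            thread_work(b, t, gridDim_x, blockDim_x, touched, n);

    int ok = 1;
    for (size_t i = 0; i < n; i++)
        if (touched[i] != 1)
            ok = 0;

    free(touched);
    return ok;
}
```

Under the assumed 64 threads per block, thread #100 is thread 36 of block 1 (1 x 64 + 36 = 100).  For an array of 200,000 elements it visits elements 100, 65,636, 131,172 and 196,708, and covers_exactly_once(1024, 64, 200000) confirms that no element is skipped or processed twice.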

With these changes we've done two things - we've offloaded the computation to the GPU, and we've organized the processing to compute 65,536 array cells in parallel.  This is the essence of CUDA coding.

In the next installment of the series we'll break this down into even more detail ... with even more examples ... please stay tuned.

 

 

Sunday,  01/22/23  05:23 PM

Another lazy day, more CUDA blogging, and more football watching.  But also got a chance to go sailing with my granddaughter Orionna, always excellent.  (Though it must be said, it was not warm.)

Speaking of not warm, a gratuitous pic of an Iditarod musher.  The "greatest race" starts on March 4, and is [hopefully] back to normal.  Can't wait.

BTW, grinchy comment, the Iditarod website is embarrassing.  It looks like someone's high school kid threw it together back in 1995.  C'mon.

Good news: Meryl Streep joins the cast of Only Murders in the Building for Season 3.  The real good news here is that there will be a Season 3, and it's imminent. 

Important info: all Marvel Movies ranked (and where to stream them).  I've never heard of many of them - and am not a Marvel comics person - but quite a few of these were excellent. 

I think I liked the ones which don't rely on pre-knowledge of the Marvel universe best, e.g. Black Panther.  YMMV.

Interesting, .NET now has a native compiler.  Only took 20 years.  When .NET was created the big rival was Java, and Java compiled to bytecode for a reason - the tagline was "write once, run anywhere".  Java never lived up to that, but .NET never had to, it only ever ran under Windows on Intel architecture machines.  I never could figure out any advantage to bytecoding. 

This is hilarious: Ukraine posts a video to reclassify American tanks as "recreational vehicles".  Love all the RV commercial tropes, the boy scout, the cowboy scenes, American nationalism.  Genius. 

BTW how remarkable that a government at war posts a Tweet lobbying another government to sell them arms.  The world has changed.

But some things have not: Henry Kissinger: Why I changed my mind about Ukraine.  "Before this war I was opposed to the membership of Ukraine in NATO because I feared it would start exactly the process we are seeing now, but the idea of a neutral Ukraine in these conditions is no longer meaningful."  He's 99 and still sees around the corner. 

Instapundit wonders, is that even allowed? - LA Times prints story that admits California storms not caused by climate change.  You mean we've had storms all along?  Who knew? 

Can we blame Al Gore?  "Speaking at the World Economic Forum's annual wankfest in Davos, Switzerland, the inventor of the internet and the scourge of massage therapists everywhere went on an unhinged rant that tells you all you need to know about the psychosis currently afflicting politicians all over the world."

Meanwhile: California lost 10M ballots in 2022.  Huh.  I'm a Californian and I vote by mail. 

The Ars Technica Rocket Report.  Includes SpaceX of course - that's a Falcon 9 launch at left ... - but many other companies are space-ing out too. 

I'm sure you've heard, Twitter are now enforcing their API rules to prohibit third-party clients which are competitors; Dave Winer reports it still works for other purposes.  I think the communication was awful but it was only a matter of time before this happened. 

Meanwhile: Mastodon has a lot of new clients; John Gruber notes that the obstacle to more isn't Mastodon but Apple.  "Mastodon’s explosive growth in the face of Twitter’s collapse has made it a new UI playground, especially so on iOS."  

For me Mastodon is still a wanna-be.  When Ukraine starts posting propaganda videos there, then we'll consider them an alternative...

And some important news: the absolute best way to make sugar cookies.  A worthwhile investigation, and well reported.  I was not asked to help taste, but would be happy to volunteer next time. 

 

 

recent ancient history

Sunday,  01/22/23  09:26 PM

I restarted blogging after a two-year gap back in December, and so I had a two-year backlog of interesting stuff to share.  (Suitably filtered for after-the-fact relevance :)  I've done it in three tranches, recent history (pre Covid through mid-2020), ancient history (mid-2020 through mid-2021), and now this one, recent ancient history (some old stuff and then mid-2021 through end-2022).  I can't explain the naming either.

This was done while watching football; today for example I watched the delightful slugfest between the Cincinnati Bengals and Buffalo Bills held in the snow of Buffalo.  Playoff football as it should be.  Yes of course my space heater is on :)

Sept 2020: Belarus, once a startup magnet, faces a tech exodus.  This has only become more true in the interim.  I managed a team headquartered in Minsk; they were the greatest people as well as strong hardworking engineers.  I feel for them, not only because of what has happened there, but the impact of the Russian invasion of Ukraine.  Easy to forget that our problems are little ones. 

Sept 2020: Still a key issue: Tech's next big task: taking the office water cooler online.  Now that it has become apparent that hybrid work is the new normal, the lack of informal communication remains a drawback. 

Nov 2020: xkcd: Ten Years.  My favorite of all time, amid heavy competition.  I know you can't read it here; please click through and enjoy. 

Dec 2020: Tim Carmody: Long Overdue.  "It’s been a long time. So, first I'll ask: are you well? What’s changed for you since I last wrote?  And the last is the most unusual one, although maybe it should not be so unusual from now on: Have you lost anyone?"  Wow.  Please read and reflect. 

Aug 2021: news you should use: How to properly cut and serve different cheeses.  I'm available if you'd like to practice :) 


Sep 2021: Maximally interesting: the 100 most spoken languages.  Another must-click-through.  What do you think is #1?  Where do you think English ranks? 

Oct 2021: the Larkin Poe cover of Dazed and Confused.  I promise this will have you from the first note. 

Oct 2021: Chris Dixon links Jack Butcher: Composability is to software, as compounding interest is to finance.  I thought it was an interesting analogy so I saved it, but now it's also interesting because Chris Dixon joined VC firm A16Z and has been among the most vociferous supporters of crypto and "Web3".  I love reading his over-the-top defenses and withering rebukes to critics.  All in the schadenfreude file. 

Jan 2022: Molly White: "Blockchain-based systems are not what they say they are".  Correct.  Chris, meet Molly. 

Nov 2022: Tim Bray: AWS and Blockchain.  "I'm not prepared to say that no blockchain-based system will ever be useful for anything. But I’m gonna stay negative until I see one actually at work doing something useful, without number-go-up greedheads clamped on its teats."

Dec 2021: Maggie's Farm considers SAT tests, which links You Aren't Actually Mad at the SATs.  A proxy for the whole academic system of testing, grading, exams, etc.  It will be most interesting to see whether the pendulum swings back on all this.  Let's hope so. 

Jan 2022: Making the web better, with blocks!  I have mad respect for Joel Spolsky and saved this thinking he was on to something, but a year later, nada.  Software composability is as difficult to find as compound interest :) 

Feb 2022: Tim Bray: Google Memory loss.  It's a true thing.  And evidence increasingly seems to indicate it's a selective memory loss. 

Mar 2022: Dave Winer ponders Evolution in Software.  "The general rule is this, you can't go back in time and redo a decision. What's done is done."

When we discover intelligent life on another planet, they will have computers; CPUs, memory, etc.  They will not have Unix.

Mar 2022: I kept this because it's interesting, and especially for my friend Daniel Jacoby: Ernest Shackleton’s Ship Found After 106 Years.  The underwater footage is amazing! 

Apr 2022: Dave Smith: This exchange continues to haunt me.  Speaking for millions of people: "Thanks Dave"! 

Oct 2022: new Elon Musk essay (in China Cyberspace magazine): Believing in technology for a better future.  "As technology accelerates, it may one day surpass human understanding and control. Some are optimistic and some are pessimistic. But I believe that as long as we are not complacent and always maintain a sense of urgency, the future of humanity will be bright, driven by the power of technology."


Jul 2022: The first astonishing image from the James Webb Telescope.  "It is the deepest image of our universe that has ever been taken."  Wow.  Many more to follow! 

Dec 2022: NASA: 50 years ago: Apollo 17.  "Not long after midnight on Dec. 7, 1972, the last crewed mission to the Moon, Apollo 17, lifted off with three astronauts: Eugene Cernan, Harrison Schmitt, and Ronald Evans."  50 years ago.  Wow. 

Dec 2022: Boing Boing: Age-of-sail style map of the Mandelbrot Set.  Beautiful!  As is the set itself; it continues to be an infinite source of awe. 

And we're up to date!  Yay.  And Onward...

 

 
 
