*AI Summary*
### Reviewer Recommendation
This material is best suited for *Systems Programmers, Graphics Engine Architects, and Game Developers* interested in low-level hardware abstraction and performance optimization. It is particularly relevant for those transitioning from high-level "state machine" APIs (like legacy OpenGL) to modern, explicit APIs (Vulkan, DX12, Metal).
### Abstract
In his 2024 Handmade Cities talk, "It's Not About The API," Mason Remaley presents a paradigm shift in how developers should approach Vulkan. While acknowledging that Vulkan requires significantly more boilerplate than legacy APIs—exemplified by a 1,180-line "Hello Triangle" implementation—he argues that the API is ultimately simpler because it reflects the actual requirements of modern hardware.
The presentation details a "hardware-first" rendering strategy designed to minimize driver overhead and API surface area. By leveraging techniques such as *Vertex Pulling,* *Bindless Rendering,* and *Draw Indirect,* developers can sidestep the most complex parts of the Vulkan API. Instead of managing hundreds of individual state objects and resources, the engine treats the GPU primarily as a destination for large, memory-mapped data buffers. Remaley demonstrates the efficacy of this approach by rendering two million objects at 60 FPS on aging hardware, arguing that an explicit, buffer-centric architecture provides both flexibility and high performance for independent engine developers.
### Summary of "It's Not About The API - Rendering in Vulkan"
* *0:34 - Background and Context:* Mason Remaley, developer of *Way of Rhea* and a Zig Software Foundation board member, discusses transitioning from an OpenGL-based engine to a custom Vulkan-based engine to achieve cross-platform parity and better hardware control.
* *1:59 - Defining Graphics APIs:* APIs are described as communication protocols for GPUs. Despite various vendors (NVIDIA, AMD, Intel) and APIs (Vulkan, DX12, Metal), the underlying hardware functionality is nearly identical, making modern APIs conceptually interchangeable.
* *6:59 - The "Boilerplate" Problem:* A direct comparison shows an OpenGL triangle requires ~40 lines of code, while a Vulkan triangle requires ~1,180. The presenter argues that this "complexity" is actually honesty; the driver used to hide synchronization, memory management, and state validation at the cost of performance and predictability.
* *10:38 - Core Vulkan Concepts:*
    * *Physical vs. Logical Devices:* Allows explicit selection of discrete vs. integrated GPUs.
    * *Command Buffers:* Exposes the asynchronous nature of GPU execution.
    * *Pipelines (PSOs):* Bakes state (blend modes, shaders, depth tests) into immutable objects to prevent expensive state-change validation during draw calls.
    * *Synchronization:* Shifts the burden of managing multi-threaded resource access from the driver to the developer for maximum utilization.
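The pipeline (PSO) idea can be sketched in plain C. This is an illustrative stand-in, not Vulkan's actual `VkGraphicsPipelineCreateInfo`: every piece of state a legacy API would let you flip per draw call is frozen into one immutable object at creation time, so the per-draw cost collapses to a cheap comparison rather than driver-side revalidation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for a baked pipeline: all state a legacy API
 * would let you mutate between draws is fixed at creation time. */
typedef struct {
    uint32_t vertex_shader_id;
    uint32_t fragment_shader_id;
    uint8_t  blend_mode;   /* e.g. 0 = opaque, 1 = alpha blend */
    bool     depth_test;
} PipelineKey;

/* Binding a pipeline at draw time reduces to a trivial equality check;
 * validating the state combination happened once, at creation. */
static bool same_pipeline(PipelineKey a, PipelineKey b) {
    return a.vertex_shader_id == b.vertex_shader_id
        && a.fragment_shader_id == b.fragment_shader_id
        && a.blend_mode == b.blend_mode
        && a.depth_test == b.depth_test;
}
```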
* *23:53 - Hardware-First Philosophy:* Design the renderer based on what the hardware is capable of (e.g., SIMD execution, memory access patterns) rather than trying to satisfy the API's specific abstractions.
* *24:57 - Simple Memory Management:* Rejects complex heap allocators or reference counting. Proposes pre-allocating large "arena" buffers per level or world chunk and using simple bump allocation, mapping the GPU memory directly to the CPU.
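A bump allocator over a pre-allocated arena is only a few lines. In the sketch below (names are illustrative, not from the talk), a plain byte pointer stands in for memory that a real engine would obtain once via `vkMapMemory` on a host-visible allocation; `align` is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

/* One arena per level or world chunk; `base` would point at
 * persistently mapped GPU memory in a real engine. */
typedef struct {
    uint8_t *base;   /* start of the mapped arena */
    size_t   cap;    /* total size in bytes */
    size_t   used;   /* bump pointer */
} GpuArena;

/* Round up to `align` (power of two), hand out `size` bytes,
 * or return NULL when the arena is exhausted. */
static void *arena_alloc(GpuArena *a, size_t size, size_t align) {
    size_t off = (a->used + align - 1) & ~(align - 1);
    if (off + size > a->cap) return NULL;
    a->used = off + size;
    return a->base + off;
}

/* Freeing is wholesale: reset once on level unload, never per object. */
static void arena_reset(GpuArena *a) { a->used = 0; }
```

There is deliberately no per-object `free`; lifetimes are tied to the level, which is what makes reference counting and heap bookkeeping unnecessary.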
* *27:33 - Vertex Pulling:* Replaces complex vertex input layouts with a single giant buffer. The vertex shader manually fetches data via indices, simplifying the API surface and allowing for creative data packing.
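A minimal sketch of the vertex-pulling idea (struct layout and packing scheme are illustrative, not from the talk). On the GPU the fetch would be an SSBO access in GLSL, roughly `Vertex v = verts[gl_VertexIndex];`; here the same lookup and a 10:10:10-style normal packing are modeled on the CPU.

```c
#include <stdint.h>

/* All vertices for a level live contiguously in one big buffer;
 * the "creative data packing" freedom means fields can be anything
 * the shader knows how to decode. */
typedef struct {
    float    pos[3];
    uint32_t packed_normal;  /* 10:10:10 packed, illustrative */
} Vertex;

/* Pack a normal with components in [-1, 1] into 10-bit channels. */
static uint32_t pack_normal(float x, float y, float z) {
    uint32_t xi = (uint32_t)((x * 0.5f + 0.5f) * 1023.0f);
    uint32_t yi = (uint32_t)((y * 0.5f + 0.5f) * 1023.0f);
    uint32_t zi = (uint32_t)((z * 0.5f + 0.5f) * 1023.0f);
    return xi | (yi << 10) | (zi << 20);
}

/* CPU model of the shader-side fetch: no vertex input layout,
 * just one flat buffer and an index. */
static Vertex pull_vertex(const Vertex *buf, uint32_t vertex_index) {
    return buf[vertex_index];
}
```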
* *29:41 - Bindless Uniforms and Parameters:* Instead of frequently binding individual uniform buffers, all scene data is placed into one large buffer indexed by the shader. This reduces driver overhead and simplifies the implementation of "Bindless Rendering."
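The "one large buffer" scheme might look like the following (field names are assumptions for illustration): every object's parameters sit in a single array, so issuing a draw only requires an index into it, for example via a push constant or the instance index, rather than a fresh descriptor binding per object.

```c
#include <stdint.h>

/* Illustrative per-object record in the one-big-buffer scheme. */
typedef struct {
    float    model[16];      /* object-to-world transform */
    uint32_t material_index; /* selects material logic in the shader */
    uint32_t texture_index;  /* index into a bindless texture array */
} ObjectParams;

/* CPU model of the shader-side lookup; in GLSL this would be an
 * SSBO access like `params[object_index]`. */
static ObjectParams fetch_params(const ObjectParams *all,
                                 uint32_t object_index) {
    return all[object_index];
}
```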
* *31:51 - Draw Indirect:* Instead of issuing individual draw calls from the CPU, draw arguments are written into a buffer. This allows the GPU to consume commands in bulk and enables multi-threaded or GPU-driven command generation.
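The command layout consumed by `vkCmdDrawIndexedIndirect` is a simple 20-byte struct, redeclared below so the sketch stands alone (the field order matches Vulkan's `VkDrawIndexedIndirectCommand`; the `push_draw` helper and its object-index convention are illustrative).

```c
#include <stdint.h>

/* Mirrors VkDrawIndexedIndirectCommand, field for field. */
typedef struct {
    uint32_t indexCount;
    uint32_t instanceCount;
    uint32_t firstIndex;
    int32_t  vertexOffset;
    uint32_t firstInstance;
} DrawIndexedIndirectCommand;

/* Append one draw into a CPU-visible command buffer; at submit time a
 * single vkCmdDrawIndexedIndirect call consumes the whole array.
 * Returns the new command count. */
static uint32_t push_draw(DrawIndexedIndirectCommand *cmds, uint32_t count,
                          uint32_t index_count, uint32_t first_index,
                          int32_t vertex_offset, uint32_t object_index) {
    cmds[count] = (DrawIndexedIndirectCommand){
        .indexCount    = index_count,
        .instanceCount = 1,
        .firstIndex    = first_index,
        .vertexOffset  = vertex_offset,
        /* A common trick: when instancing isn't otherwise used,
         * firstInstance can carry the object's parameter index. */
        .firstInstance = object_index,
    };
    return count + 1;
}
```

Because the commands are plain data in a buffer, they can be written from multiple threads, or generated by a compute shader for fully GPU-driven rendering.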
* *33:46 - Uber-Shaders and Materials:* Remaley advocates for a "Material Uber-Shader" using switch statements to handle different material logic within a single pipeline. He addresses the myth that shader branching is inherently expensive, noting that SIMD divergence is minimal when all invocations of a rendered object take the same branch.
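A CPU model of the uber-shader's material switch (the material set and "shading" math here are toy assumptions; in GLSL the switch would key on a `material_type` read from the parameter buffer). The divergence argument is that when every SIMD lane in a wave shades the same object, all lanes take the same case, so the branch costs almost nothing.

```c
#include <stdint.h>

enum { MAT_UNLIT = 0, MAT_LAMBERT = 1, MAT_EMISSIVE = 2 };

/* Toy per-fragment "shading": one switch covers every material,
 * so one pipeline covers every object. */
static float shade(uint32_t material_type, float n_dot_l, float albedo) {
    switch (material_type) {
    case MAT_UNLIT:    return albedo;
    case MAT_LAMBERT:  return albedo * (n_dot_l > 0.0f ? n_dot_l : 0.0f);
    case MAT_EMISSIVE: return albedo * 2.0f;
    default:           return 0.0f;
    }
}
```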
* *38:43 - High-Performance Demonstration:* Using the described buffer-centric approach, the presenter demonstrates rendering 10,000 to 2,000,000 objects simultaneously at 60 FPS on a six-year-old laptop, illustrating the efficiency of minimal driver interaction.
* *45:28 - Final Philosophy:* A good API facilitates communication with the underlying system; a bad one obstructs it. Modern explicit APIs like Vulkan are deemed "simpler" for engine developers because they remove the "black box" of driver heuristics.
* *48:00 - Q&A Highlights:*
    * *Validation Layers:* Recommended as an essential debug tool that replaces the old `glGetError` methodology.
    * *Vulkan Extensions:* Advised checking cross-vendor support via *vulkan.gpuinfo.org* and comparing with DirectX feature levels to gauge mainstream hardware support.
    * *Learning Resources:* Suggests rewriting tutorials to remove unnecessary C++ abstractions to better understand the raw API.
AI-generated summary created with gemini-3-flash-preview via RocketRecap.com. (Input: 36,991 tokens, Output: 1,171 tokens, Est. cost: $0.02.)
**Please give an abstract of the transcript and then summarize the transcript in a self-contained bullet list format.** Include starting timestamps, important details and key takeaways.
Example Input:
Fluidigm Polaris Part 2- illuminator and camera
mikeselectricstuff
131K subscribers
Subscribed
369
Share
Download
Clip
Save
5,857 views Aug 26, 2024
Fluidigm Polaris part 1 : • Fluidigm Polaris (Part 1) - Biotech g...
Ebay listings: https://www.ebay.co.uk/usr/mikeselect...
Merch https://mikeselectricstuff.creator-sp...
Transcript
Follow along using the transcript.
Show transcript
mikeselectricstuff
131K subscribers
Videos
About
Support on Patreon
40 Comments
@robertwatsonbath
6 hours ago
Thanks Mike. Ooof! - with the level of bodgery going on around 15:48 I think shame would have made me do a board re spin, out of my own pocket if I had to.
1
Reply
@Muonium1
9 hours ago
The green LED looks different from the others and uses phosphor conversion because of the "green gap" problem where green InGaN emitters suffer efficiency droop at high currents. Phosphide based emitters don't start becoming efficient until around 600nm so also can't be used for high power green emitters. See the paper and plot by Matthias Auf der Maur in his 2015 paper on alloy fluctuations in InGaN as the cause of reduced external quantum efficiency at longer (green) wavelengths.
4
Reply
1 reply
@tafsirnahian669
10 hours ago (edited)
Can this be used as an astrophotography camera?
Reply
mikeselectricstuff
·
1 reply
@mikeselectricstuff
6 hours ago
Yes, but may need a shutter to avoid light during readout
Reply
@2010craggy
11 hours ago
Narrowband filters we use in Astronomy (Astrophotography) are sided- they work best passing light in one direction so I guess the arrows on the filter frames indicate which way round to install them in the filter wheel.
1
Reply
@vitukz
12 hours ago
A mate with Channel @extractions&ire could use it
2
Reply
@RobertGallop
19 hours ago
That LED module says it can go up to 28 amps!!! 21 amps for 100%. You should see what it does at 20 amps!
Reply
@Prophes0r
19 hours ago
I had an "Oh SHIT!" moment when I realized that the weird trapezoidal shape of that light guide was for keystone correction of the light source.
Very clever.
6
Reply
@OneBiOzZ
20 hours ago
given the cost of the CCD you think they could have run another PCB for it
9
Reply
@tekvax01
21 hours ago
$20 thousand dollars per minute of run time!
1
Reply
@tekvax01
22 hours ago
"We spared no expense!" John Hammond Jurassic Park.
*(that's why this thing costs the same as a 50-seat Greyhound Bus coach!)
Reply
@florianf4257
22 hours ago
The smearing on the image could be due to the fact that you don't use a shutter, so you see brighter stripes under bright areas of the image as you still iluminate these pixels while the sensor data ist shifted out towards the top. I experienced this effect back at university with a LN-Cooled CCD for Spectroscopy. The stripes disapeared as soon as you used the shutter instead of disabling it in the open position (but fokussing at 100ms integration time and continuous readout with a focal plane shutter isn't much fun).
12
Reply
mikeselectricstuff
·
1 reply
@mikeselectricstuff
12 hours ago
I didn't think of that, but makes sense
2
Reply
@douro20
22 hours ago (edited)
The red LED reminds me of one from Roithner Lasertechnik. I have a Symbol 2D scanner which uses two very bright LEDs from that company, one red and one red-orange. The red-orange is behind a lens which focuses it into an extremely narrow beam.
1
Reply
@RicoElectrico
23 hours ago
PFG is Pulse Flush Gate according to the datasheet.
Reply
@dcallan812
23 hours ago
Very interesting. 2x
Reply
@littleboot_
1 day ago
Cool interesting device
Reply
@dav1dbone
1 day ago
I've stripped large projectors, looks similar, wonder if some of those castings are a magnesium alloy?
Reply
@kevywevvy8833
1 day ago
ironic that some of those Phlatlight modules are used in some of the cheapest disco lights.
1
Reply
1 reply
@bill6255
1 day ago
Great vid - gets right into subject in title, its packed with information, wraps up quickly. Should get a YT award! imho
3
Reply
@JAKOB1977
1 day ago (edited)
The whole sensor module incl. a 5 grand 50mpix sensor for 49 £.. highest bid atm
Though also a limited CCD sensor, but for the right buyer its a steal at these relative low sums.
Architecture Full Frame CCD (Square Pixels)
Total Number of Pixels 8304 (H) × 6220 (V) = 51.6 Mp
Number of Effective Pixels 8208 (H) × 6164 (V) = 50.5 Mp
Number of Active Pixels 8176 (H) × 6132 (V) = 50.1 Mp
Pixel Size 6.0 m (H) × 6.0 m (V)
Active Image Size 49.1 mm (H) × 36.8 mm (V)
61.3 mm (Diagonal),
645 1.1x Optical Format
Aspect Ratio 4:3
Horizontal Outputs 4
Saturation Signal 40.3 ke−
Output Sensitivity 31 V/e−
Quantum Efficiency
KAF−50100−CAA
KAF−50100−AAA
KAF−50100−ABA (with Lens)
22%, 22%, 16% (Peak R, G, B)
25%
62%
Read Noise (f = 18 MHz) 12.5 e−
Dark Signal (T = 60°C) 42 pA/cm2
Dark Current Doubling Temperature 5.7°C
Dynamic Range (f = 18 MHz) 70.2 dB
Estimated Linear Dynamic Range
(f = 18 MHz)
69.3 dB
Charge Transfer Efficiency
Horizontal
Vertical
0.999995
0.999999
Blooming Protection
(4 ms Exposure Time)
800X Saturation Exposure
Maximum Date Rate 18 MHz
Package Ceramic PGA
Cover Glass MAR Coated, 2 Sides or
Clear Glass
Features
• TRUESENSE Transparent Gate Electrode
for High Sensitivity
• Ultra-High Resolution
• Board Dynamic Range
• Low Noise Architecture
• Large Active Imaging Area
Applications
• Digitization
• Mapping/Aerial
• Photography
• Scientific
Thx for the tear down Mike, always a joy
Reply
@martinalooksatthings
1 day ago
15:49 that is some great bodging on of caps, they really didn't want to respin that PCB huh
8
Reply
@RhythmGamer
1 day ago
Was depressed today and then a new mike video dropped and now I’m genuinely happy to get my tear down fix
1
Reply
@dine9093
1 day ago (edited)
Did you transfrom into Mr Blobby for a moment there?
2
Reply
@NickNorton
1 day ago
Thanks Mike. Your videos are always interesting.
5
Reply
@KeritechElectronics
1 day ago
Heavy optics indeed... Spare no expense, cost no object. Splendid build quality. The CCD is a thing of beauty!
1
Reply
@YSoreil
1 day ago
The pricing on that sensor is about right, I looked in to these many years ago when they were still in production since it's the only large sensor you could actually buy. Really cool to see one in the wild.
2
Reply
@snik2pl
1 day ago
That leds look like from led projector
Reply
@vincei4252
1 day ago
TDI = Time Domain Integration ?
1
Reply
@wolpumba4099
1 day ago (edited)
Maybe the camera should not be illuminated during readout.
From the datasheet of the sensor (Onsemi): saturation 40300 electrons, read noise 12.5 electrons per pixel @ 18MHz (quite bad). quantum efficiency 62% (if it has micro lenses), frame rate 1 Hz. lateral overflow drain to prevent blooming protects against 800x (factor increases linearly with exposure time) saturation exposure (32e6 electrons per pixel at 4ms exposure time), microlens has +/- 20 degree acceptance angle
i guess it would be good for astrophotography
4
Reply
@txm100
1 day ago (edited)
Babe wake up a new mikeselectricstuff has dropped!
9
Reply
@vincei4252
1 day ago
That looks like a finger-lakes filter wheel, however, for astronomy they'd never use such a large stepper.
1
Reply
@MRooodddvvv
1 day ago
yaaaaay ! more overcomplicated optical stuff !
4
Reply
1 reply
@NoPegs
1 day ago
He lives!
11
Reply
1 reply
Transcript
0:00
so I've stripped all the bits of the
0:01
optical system so basically we've got
0:03
the uh the camera
0:05
itself which is mounted on this uh very
0:09
complex
0:10
adjustment thing which obviously to set
0:13
you the various tilt and uh alignment
0:15
stuff then there's two of these massive
0:18
lenses I've taken one of these apart I
0:20
think there's something like about eight
0:22
or nine Optical elements in here these
0:25
don't seem to do a great deal in terms
0:26
of electr magnification they're obiously
0:28
just about getting the image to where it
0:29
uh where it needs to be just so that
0:33
goes like that then this Optical block I
0:36
originally thought this was made of some
0:37
s crazy heavy material but it's just
0:39
really the sum of all these Optical bits
0:41
are just ridiculously heavy those lenses
0:43
are about 4 kilos each and then there's
0:45
this very heavy very solid um piece that
0:47
goes in the middle and this is so this
0:49
is the filter wheel assembly with a
0:51
hilariously oversized steper
0:53
motor driving this wheel with these very
0:57
large narrow band filters so we've got
1:00
various different shades of uh
1:03
filters there five Al together that
1:06
one's actually just showing up a silver
1:07
that's actually a a red but fairly low
1:10
transmission orangey red blue green
1:15
there's an excess cover on this side so
1:16
the filters can be accessed and changed
1:19
without taking anything else apart even
1:21
this is like ridiculous it's like solid
1:23
aluminium this is just basically a cover
1:25
the actual wavelengths of these are um
1:27
488 525 570 630 and 700 NM not sure what
1:32
the suffix on that perhaps that's the uh
1:34
the width of the spectral line say these
1:37
are very narrow band filters most of
1:39
them are you very little light through
1:41
so it's still very tight narrow band to
1:43
match the um fluoresence of the dies
1:45
they're using in the biochemical process
1:48
and obviously to reject the light that's
1:49
being fired at it from that Illuminator
1:51
box and then there's a there's a second
1:53
one of these lenses then the actual sort
1:55
of samples below that so uh very serious
1:58
amount of very uh chunky heavy Optics
2:01
okay let's take a look at this light
2:02
source made by company Lumen Dynamics
2:04
who are now part of
2:06
excelitas self-contained unit power
2:08
connector USB and this which one of the
2:11
Cable Bundle said was a TTL interface
2:14
USB wasn't used in uh the fluid
2:17
application output here and I think this
2:19
is an input for um light feedback I
2:21
don't if it's regulated or just a measur
2:23
measurement facility and the uh fiber
2:27
assembly
2:29
Square Inlet there and then there's two
2:32
outputs which have uh lens assemblies
2:35
and this small one which goes back into
2:37
that small Port just Loops out of here
2:40
straight back in So on this side we've
2:42
got the electronics which look pretty
2:44
straightforward we've got a bit of power
2:45
supply stuff over here and we've got
2:48
separate drivers for each wavelength now
2:50
interesting this is clearly been very
2:52
specifically made for this application
2:54
you I was half expecting like say some
2:56
generic drivers that could be used for a
2:58
number of different things but actually
3:00
literally specified the exact wavelength
3:02
on the PCB there is provision here for
3:04
385 NM which isn't populated but this is
3:07
clearly been designed very specifically
3:09
so these four drivers look the same but
3:10
then there's two higher power ones for
3:12
575 and
3:14
520 a slightly bigger heat sink on this
3:16
575 section there a p 24 which is
3:20
providing USB interface USB isolator the
3:23
USB interface just presents as a comport
3:26
I did have a quick look but I didn't
3:27
actually get anything sensible um I did
3:29
dump the Pi code out and there's a few
3:31
you a few sort of commands that you
3:32
could see in text but I didn't actually
3:34
manage to get it working properly I
3:36
found some software for related version
3:38
but it didn't seem to want to talk to it
3:39
but um I say that wasn't used for the
3:41
original application it might be quite
3:42
interesting to get try and get the Run
3:44
hours count out of it and the TTL
3:46
interface looks fairly straightforward
3:48
we've got positions for six opto
3:50
isolators but only five five are
3:52
installed so that corresponds with the
3:54
unused thing so I think this hopefully
3:56
should be as simple as just providing a
3:57
ttrl signal for each color to uh enable
4:00
it a big heat sink here which is there I
4:03
think there's like a big S of metal
4:04
plate through the middle of this that
4:05
all the leads are mounted on the other
4:07
side so this is heat sinking it with a
4:09
air flow from a uh just a fan in here
4:13
obviously don't have the air flow
4:14
anywhere near the Optics so conduction
4:17
cool through to this plate that's then
4:18
uh air cooled got some pots which are
4:21
presumably power
4:22
adjustments okay let's take a look at
4:24
the other side which is uh much more
4:27
interesting see we've got some uh very
4:31
uh neatly Twisted cable assemblies there
4:35
a bunch of leads so we've got one here
4:37
475 up here 430 NM 630 575 and 520
4:44
filters and dcro mirrors a quick way to
4:48
see what's white is if we just shine
4:49
some white light through
4:51
here not sure how it is is to see on the
4:54
camera but shining white light we do
4:55
actually get a bit of red a bit of blue
4:57
some yellow here so the obstacle path
5:00
575 it goes sort of here bounces off
5:03
this mirror and goes out the 520 goes
5:07
sort of down here across here and up
5:09
there 630 goes basically straight
5:13
through
5:15
430 goes across there down there along
5:17
there and the 475 goes down here and
5:20
left this is the light sensing thing
5:22
think here there's just a um I think
5:24
there a photo diode or other sensor
5:26
haven't actually taken that off and
5:28
everything's fixed down to this chunk of
5:31
aluminium which acts as the heat
5:32
spreader that then conducts the heat to
5:33
the back side for the heat
5:35
sink and the actual lead packages all
5:38
look fairly similar except for this one
5:41
on the 575 which looks quite a bit more
5:44
substantial big spay
5:46
Terminals and the interface for this
5:48
turned out to be extremely simple it's
5:50
literally a 5V TTL level to enable each
5:54
color doesn't seem to be any tensity
5:56
control but there are some additional
5:58
pins on that connector that weren't used
5:59
in the through time thing so maybe
6:01
there's some extra lines that control
6:02
that I couldn't find any data on this uh
6:05
unit and the um their current product
6:07
range is quite significantly different
6:09
so we've got the uh blue these
6:13
might may well be saturating the camera
6:16
so they might look a bit weird so that's
6:17
the 430
6:18
blue the 575
6:24
yellow uh
6:26
475 light blue
6:29
the uh 520
6:31
green and the uh 630 red now one
6:36
interesting thing I noticed for the
6:39
575 it's actually it's actually using a
6:42
white lead and then filtering it rather
6:44
than using all the other ones are using
6:46
leads which are the fundamental colors
6:47
but uh this is actually doing white and
6:50
it's a combination of this filter and
6:52
the dichroic mirrors that are turning to
6:55
Yellow if we take the filter out and a
6:57
lot of the a lot of the um blue content
7:00
is going this way the red is going
7:02
straight through these two mirrors so
7:05
this is clearly not reflecting much of
7:08
that so we end up with the yellow coming
7:10
out of uh out of there which is a fairly
7:14
light yellow color which you don't
7:16
really see from high intensity leads so
7:19
that's clearly why they've used the
7:20
white to uh do this power consumption of
7:23
the white is pretty high so going up to
7:25
about 2 and 1 half amps on that color
7:27
whereas most of the other colors are
7:28
only drawing half an amp or so at 24
7:30
volts the uh the green is up to about
7:32
1.2 but say this thing is uh much
7:35
brighter and if you actually run all the
7:38
colors at the same time you get a fairly
7:41
reasonable um looking white coming out
7:43
of it and one thing you might just be
7:45
out to notice is there is some sort
7:46
color banding around here that's not
7:49
getting uh everything s completely
7:51
concentric and I think that's where this
7:53
fiber optic thing comes
7:58
in I'll
8:00
get a couple of Fairly accurately shaped
8:04
very sort of uniform color and looking
8:06
at What's um inside here we've basically
8:09
just got this Square Rod so this is
8:12
clearly yeah the lights just bouncing
8:13
off all the all the various sides to um
8:16
get a nice uniform illumination uh this
8:19
back bit looks like it's all potted so
8:21
nothing I really do to get in there I
8:24
think this is fiber so I have come
8:26
across um cables like this which are
8:27
liquid fill but just looking through the
8:30
end of this it's probably a bit hard to
8:31
see it does look like there fiber ends
8:34
going going on there and so there's this
8:36
feedback thing which is just obviously
8:39
compensating for the any light losses
8:41
through here to get an accurate
8:43
representation of uh the light that's
8:45
been launched out of these two
8:47
fibers and you see uh
8:49
these have got this sort of trapezium
8:54
shape light guides again it's like a
8:56
sort of acrylic or glass light guide
9:00
guess projected just to make the right
9:03
rectangular
9:04
shape and look at this Center assembly
9:07
um the light output doesn't uh change
9:10
whether you feed this in or not so it's
9:11
clear not doing any internal Clos Loop
9:14
control obviously there may well be some
9:16
facility for it to do that but it's not
9:17
being used in this
9:19
application and so this output just
9:21
produces a voltage on the uh outle
9:24
connector proportional to the amount of
9:26
light that's present so there's a little
9:28
diffuser in the back there
9:30
and then there's just some kind of uh
9:33
Optical sensor looks like a
9:35
chip looking at the lead it's a very
9:37
small package on the PCB with this lens
9:40
assembly over the top and these look
9:43
like they're actually on a copper
9:44
Metalized PCB for maximum thermal
9:47
performance and yeah it's a very small
9:49
package looks like it's a ceramic
9:51
package and there's a thermister there
9:53
for temperature monitoring this is the
9:56
475 blue one this is the 520 need to
9:59
Green which is uh rather different OB
10:02
it's a much bigger D with lots of bond
10:04
wise but also this looks like it's using
10:05
a phosphor if I shine a blue light at it
10:08
lights up green so this is actually a
10:10
phosphor conversion green lead which
10:12
I've I've come across before they want
10:15
that specific wavelength so they may be
10:17
easier to tune a phosphor than tune the
10:20
um semiconductor material to get the uh
10:23
right right wavelength from the lead
10:24
directly uh red 630 similar size to the
10:28
blue one or does seem to have a uh a
10:31
lens on top of it there is a sort of red
10:33
coloring to
10:35
the die but that doesn't appear to be
10:38
fluorescent as far as I can
10:39
tell and the white one again a little
10:41
bit different sort of much higher
10:43
current
10:46
connectors a makeer name on that
10:48
connector flot light not sure if that's
10:52
the connector or the lead
10:54
itself and obviously with the phosphor
10:56
and I'd imagine that phosphor may well
10:58
be tuned to get the maximum to the uh 5
11:01
cenm and actually this white one looks
11:04
like a St fairly standard product I just
11:06
found it in Mouse made by luminous
11:09
devices in fact actually I think all
11:11
these are based on various luminous
11:13
devices modules and they're you take
11:17
looks like they taking the nearest
11:18
wavelength and then just using these
11:19
filters to clean it up to get a precise
11:22
uh spectral line out of it so quite a
11:25
nice neat and um extreme
11:30
bright light source uh sure I've got any
11:33
particular use for it so I think this
11:35
might end up on
11:36
eBay but uh very pretty to look out and
11:40
without the uh risk of burning your eyes
11:43
out like you do with lasers so I thought
11:45
it would be interesting to try and
11:46
figure out the runtime of this things
11:48
like this we usually keep some sort
11:49
record of runtime cuz leads degrade over
11:51
time I couldn't get any software to work
11:52
through the USB face but then had a
11:54
thought probably going to be writing the
11:55
runtime periodically to the e s prom so
11:58
I just just scope up that and noticed it
12:00
was doing right every 5 minutes so I
12:02
just ran it for a while periodically
12:04
reading the E squ I just held the pick
12:05
in in reset and um put clip over to read
12:07
the square prom and found it was writing
12:10
one location per color every 5 minutes
12:12
so if one color was on it would write
12:14
that location every 5 minutes and just
12:16
increment it by one so after doing a few
12:18
tests with different colors of different
12:19
time periods it looked extremely
12:21
straightforward it's like a four bite
12:22
count for each color looking at the
12:24
original data that was in it all the
12:26
colors apart from Green were reading
12:28
zero and the green was reading four
12:30
indicating a total 20 minutes run time
12:32
ever if it was turned on run for a short
12:34
time then turned off that might not have
12:36
been counted but even so indicates this
12:37
thing wasn't used a great deal the whole
12:40
s process of doing a run can be several
12:42
hours but it'll only be doing probably
12:43
the Imaging at the end of that so you
12:46
wouldn't expect to be running for a long
12:47
time but say a single color for 20
12:50
minutes over its whole lifetime does
12:52
seem a little bit on the low side okay
12:55
let's look at the camera un fortunately
12:57
I managed to not record any sound when I
12:58
did this it's also a couple of months
13:00
ago so there's going to be a few details
13:02
that I've forgotten so I'm just going to
13:04
dub this over the original footage so um
13:07
take the lid off see this massive great
13:10
heat sink so this is a pel cool camera
13:12
we've got this blower fan producing a
13:14
fair amount of air flow through
13:16
it the connector here there's the ccds
13:19
mounted on the board on the
13:24
right this unplugs so we've got a bit of
13:27
power supply stuff on here
13:29
USB interface I think that's the Cyprus
13:32
microcontroller High speeded USB
13:34
interface there's a zyink spon fpga some
13:40
RAM and there's a couple of ATD
13:42
converters can't quite read what those
13:45
those are but anal
13:47
devices um little bit of bodgery around
13:51
here extra decoupling obviously they
13:53
have having some noise issues this is
13:55
around the ram chip quite a lot of extra
13:57
capacitors been added there
13:59
uh there's a couple of amplifiers prior
14:01
to the HD converter buffers or Andor
14:05
amplifiers taking the CCD
14:08
signal um bit more power spy stuff here
14:11
this is probably all to do with
14:12
generating the various CCD bias voltages
14:14
they uh need quite a lot of exotic
14:18
voltages next board down is just a
14:20
shield and an interconnect
14:24
boardly shielding the power supply stuff
14:26
from some the more sensitive an log
14:28
stuff
14:31
and this is the bottom board which is
14:32
just all power supply
14:34
stuff as you can see tons of capacitors
14:37
or Transformer in
14:42
there and this is the CCD which is a uh
14:47
very impressive thing this is a kf50 100
14:50
originally by true sense then codec
14:53
there ON
14:54
Semiconductor it's 50 megapixels uh the
14:58
only price I could find was this one
15:00
5,000 bucks and the architecture you can
15:03
see there actually two separate halves
15:04
which explains the Dual AZ converters
15:06
and two amplifiers it's literally split
15:08
down the middle and duplicated so it's
15:10
outputting two streams in parallel just
15:13
to keep the bandwidth sensible and it's
15:15
got this amazing um diffraction effects
15:18
it's got micro lenses over the pixel so
15:20
there's there's a bit more Optics going
15:22
on than on a normal
15:25
sensor few more bodges on the CCD board
15:28
including this wire which isn't really
15:29
tacked down very well which is a bit uh
15:32
bit of a mess quite a few bits around
15:34
this board where they've uh tacked
15:36
various bits on which is not super
15:38
impressive looks like CCD drivers on the
15:40
left with those 3 ohm um damping
15:43
resistors on the
15:47
output get a few more little bodges
15:50
around here some of
15:52
the and there's this separator the
15:54
silica gel to keep the moisture down but
15:56
there's this separator that actually
15:58
appears to be cut from piece of
15:59
antistatic
16:04
bag and this sort of thermal block on
16:06
top of this stack of three pel Cola
16:12
modules so as with any Stacks they get
16:16
um larger as they go back towards the
16:18
heat sink because each P's got to not
16:20
only take the heat from the previous but
16:21
also the waste heat which is quite
16:27
significant you see a little temperature
16:29
sensor here that copper block which
16:32
makes contact with the back of the
16:37
CCD and this's the back of the
16:40
pelas this then contacts the heat sink
16:44
on the uh rear there a few thermal pads
16:46
as well for some of the other power
16:47
components on this
16:51
PCB. OK, I've connected this camera up. I found some drivers on the disc that seem to work under Windows 7; I couldn't get them to install under Windows 11, though. In the absence of any sort of lens, or being bothered to do it properly, I've just put some foil over it with a little pin hole in it to make a pinhole lens.
17:08
The software gives a few options, and I'm not entirely sure what they all are. There's obviously a clock frequency, 22 MHz; low gain; and "with PFG" — no idea what that is, something-something gain, programmable-something-gain perhaps. There are various exposure types; I think "focus" is just a continuous grab until you tell it to stop. There's obviously exposure time, and triggers: an external hardware trigger input, or you just trigger using a control on screen. The resolution is 8176 by 6132, and you can actually bin those pixels, where you combine multiple pixels to get increased gain at the expense of lower resolution. This is a 10-second exposure
17:51
— of the pinhole, obviously. It's very insensitive, so we'll just stand still. Now it's downloading; there's the exposure. There's a little status readout down here that tells you the exposure time, and this is just it downloading. I'm seeing quite a lot of smearing. I don't know whether that's just due to pixels overloading or something else; it's not out of the question that there's something not totally right about this camera — there certainly were bodges on it. I'd imagine a camera like this has a fairly narrow range of intensities that it's happy with.
18:38
I'm not going to spend a great deal of time on this. If you're interested in this camera, maybe for astronomy or something, and you're happy to take the risk that it may not be perfect, I think I'll stick it on eBay along with the illuminator. I'll put a link down in the description to the listing — take your chances to grab a bargain. So, for example,
18:54
here we see this vertical streaking — I'm not sure how normal that is. This is a fairly bright scene, looking out the window. If I cut the exposure time down — it's now a 1-second exposure — most of the image disappears; again, it looks like it's possibly still overloading here. Let's go down to, say, a quarter of a second. I think there might be some auto-gain control going on here. This is with the PFG option; let's try turning that off and see what happens. I'm not sure whether this actually has more streaking, or it's just cranked up the gain and the display grayscale to show the range of values it's captured.
19:36
One odd thing in the software: you can't seem to read out the temperature of the Peltier cooler, but you can set the temperature, and if you set a different temperature you see the power consumption jump up as it runs the cooler to reach the temperature you requested. But I can't see anything anywhere that tells you whether the cooler has reached that temperature, other than the power consumption going down — there's no temperature readout here. This is just very basic software; I'm sure there's an API for more sophisticated applications. But if you know anything
20:12
more about these cameras, please stick it in the comments. Incidentally, while editing I noticed a bent pin on the CCD that I hadn't spotted earlier, but I did fix that before doing these tests. I also reactivated the silica gel desiccant, because I noticed I was getting a bit of condensation on the window. So: a couple of interesting, but maybe not particularly useful, pieces of kit — except for someone who's got a very specific use. I'll stick these on eBay, put a link in the description, and hopefully someone can actually make some good use of these things.
Example Output:
**Abstract:**
This video presents Part 2 of a teardown focusing on the optical components of a Fluidigm Polaris biotechnology instrument, specifically the multi-wavelength illuminator and the high-resolution CCD camera.
The Lumen Dynamics illuminator unit is examined in detail, revealing its construction using multiple high-power LEDs (430nm, 475nm, 520nm, 575nm, 630nm) combined via dichroic mirrors and filters. A square fiber optic rod is used to homogenize the light. A notable finding is the use of a phosphor-converted white LED filtered to achieve the 575nm output. The unit features simple TTL activation for each color, conduction cooling, and internal homogenization optics. Analysis of its EEPROM suggests extremely low operational runtime.
The camera module teardown showcases a 50 Megapixel ON Semiconductor KAF-50100 CCD sensor with micro-lenses, cooled by a multi-stage Peltier stack. The control electronics include an FPGA and a USB interface. Significant post-manufacturing modifications ("bodges") are observed on the camera's circuit boards. Basic functional testing using vendor software and a pinhole lens confirms image capture but reveals prominent vertical streaking artifacts, the cause of which remains uncertain (potential overload, readout artifact, or fault).
**Exploring the Fluidigm Polaris: A Detailed Look at its High-End Optics and Camera System**
* **0:00 High-End Optics:** The system utilizes heavy, high-quality lenses and mirrors for precise imaging, weighing around 4 kilos each.
* **0:49 Narrow Band Filters:** A filter wheel with five narrow band filters (488, 525, 570, 630, and 700 nm) ensures accurate fluorescence detection and rejection of excitation light.
* **2:01 Customizable Illumination:** The Lumen Dynamics light source offers five individually controllable LED wavelengths (430, 475, 520, 575, 630 nm) with varying power outputs. The 575nm yellow LED is uniquely achieved using a white LED with filtering.
* **3:45 TTL Control:** The light source is controlled via a simple TTL interface, enabling easy on/off switching for each LED color.
* **12:55 Sophisticated Camera:** The system includes a 50-megapixel ON Semiconductor (Kodak) KAF-50100 CCD camera with a Peltier cooling system for reduced noise.
* **14:54 High-Speed Data Transfer:** The camera features dual analog-to-digital converters to manage the high data throughput of the 50-megapixel sensor, which is effectively two 25-megapixel sensors operating in parallel.
* **18:11 Possible Issues:** The video creator noted some potential issues with the camera, including image smearing.
* **18:11 Limited Dynamic Range:** The camera's sensor has a limited dynamic range, making it potentially challenging to capture scenes with a wide range of brightness levels.
* **11:45 Low Runtime:** Internal data suggests the system has seen minimal usage, with only 20 minutes of recorded runtime for the green LED.
* **20:38 Availability on eBay:** Both the illuminator and camera are expected to be listed for sale on eBay.
Here is the real transcript. What would be a good group of people to review this topic? Please provide a summary like they would:
It's Not About The API - Fast, Flexible, and Simple Rendering in Vulkan
Games by Mason
130 subscribers
Feb 7, 2026
My Vulkan talk from HMS 2024.
Slides: https://gamesbymason.com/hms2024/
Newsletter: https://gamesbymason.com/newsletter/
Blog: https://gamesbymason.com/blog/
Handmade Cities: https://handmadecities.com/
This Vulkan rendering talk goes beyond API specifics, focusing on hardware capabilities. Learn core concepts like command buffers and pipelines, then discover a surprisingly simple rendering approach. The presenter demonstrates techniques for fast, flexible rendering, even on older hardware.
16 Comments
@Cristian_DG
8 hours ago
This talk changed the way I look at Vulkan from impossible to doable. Great talk!
@john-sr3fi
6 hours ago
Really nice — this has somehow been the best introduction to Vulkan for beginners!
@phizc
5 hours ago
Design for hardware, not graphics API is fantastic advice!
@LucasDevelop
6 hours ago
Thanks for the nice video :D
@snbv5real
2 hours ago (edited)
Uber shaders can cause over-allocation of registers; you need to make sure that each branch uses approximately the same number of variables, or that you're using specialization constants to switch. GLSL also does support uint16, as well as uint8 and uint64, with #extension GL_EXT_shader_16bit_storage, or via the full numeric extension using uint16_t. This has been true for almost a decade.
@FoZard
5 hours ago
solid talk. very helpful
@NikolaNevenov86
1 hour ago
Yeah, Vulkan setup code is huge... but once that's up it's pretty straightforward. The biggest hurdles for me were the pipeline layout, descriptor sets, descriptor layout, and the synchronization part. The rest is pretty easy to understand.
@PaulSebastianManole
34 minutes ago
1:58 oh, I get it, because GPUs are exorbitantly expensive now he couldn't find one to photograph.
@AyushBakshi
5 hours ago (edited)
we really should purge MS higher ups until the good ones come.
@eudorian111
3 hours ago
39:00 So this is also how the latest versions of OpenGL would work: upload a bunch of buffers with all the data in them, have a giant shader that does everything. But now, if you run it on any hardware other than Nvidia, it breaks. How will Vulkan prevent this? How will my Intel IGP from 2009 behave as you would expect with Vulkan?
@ABaumstumpf
4 hours ago
The API listing at the start... sadly Khronos killed OpenGL despite it having had the widest support, having everything from low-level to high-level, and decent documentation. (Wow, Vulkan was a complete and utter mess when introduced — anything but usable.)
Also, "run by a nonprofit" is, at least in my book, a downside. Contrary to for-profit organisations, there is nearly no checking in place — where is the money spent, what are the future goals, who is doing what. You don't need to look further than the Linux Foundation to see that (hint: that foundation claims to try to unify Linux... but it spends all the money on everything else, just NOT Linux, and has often impeded Linux and open source in general).
A problem of Vulkan is specifically the way it was designed:
AMD created their own new API to target their own hardware (AMD had a huge problem with bugs and performance in their drivers) that pushed all responsibilities onto developers. Khronos took that as the basis and went "Yep, let's invent a new future API" and did just that, from their ivory tower, with no consideration for reality. All the stuff that the GPUs themselves are better at handling, all the things that APIs are for — Vulkan does none of those. Instead of just getting some texture memory and using it, you now have to ask the GPU if it even supports textures, then which formats, what draw commands, how it interacts with other buffers, and then build all those things yourself... you know, stuff that is fixed for any GPU and the driver could do with basically zero overhead, because the driver KNOWS what the hardware can do.
I would say Vulkan is not a graphics API but a toolbox for writing custom, non-reusable, hardware- and software-specific drivers for every single GPU, with a lot of runtime evaluation.
The benefit of OLD OpenGL (1 to 2) was simply simplicity. You could start with something really simple and then expand on it. With Vulkan, "use a 3rd-party library that does all the abstraction for you and don't use Vulkan" is basically the only way to get started if you don't plan on spending 200 hours upfront learning the stuff and biting into the thousands of arcane problems you'll encounter. In Vulkan you just don't have the choice of not doing all the steps manually. "Yeah it is complicated and more code, but it has to be" is the premise of the talk... i.e., it IS complex, which is not actually of any benefit to getting started, nor helpful for most small projects.
@obinnaokafor6252
4 hours ago
Nobody agreed with the OpenGL point; it does not have anything to do with Microsoft. Moreover, DirectX is far better than them all.
@GillesLouisReneDeleuze
8 hours ago
chat is dis tru
@gsestream
1 hour ago
vulkan api is bloatware.
Transcript
0:00
"It says it's not about the API, and I want to get your thoughts on that, Mason. We all want to know. Give us a round of applause."
0:11
"No problem. I'll take this — more free-form. And does this go back within 40?" "Yes. This is the big fat green button for next; the red one is back. And if you want to point, you can also do that." "Awesome. Okay." "Yeah, enjoy."
0:29
Hey, thanks for the introduction, Abner. Thanks for having me again — I always look forward to Handmade; it's one of my favorite conferences. So let me get this started. I'm Mason Ramali. I'm an independent game developer and a board member of the Zig Software Foundation, and in the past I taught grad and undergrad students to make games and game engines. Like Abner mentioned, I shipped my game Way of Rhea in May, and I learned a lot from making it — I talk more about that in another talk you can look up on YouTube. One of the outcomes is that I decided I want to build a new engine for my next game. Way of Rhea was built on OpenGL, and as many of you are probably aware, there's a bunch of new APIs replacing the old-style graphics APIs. So I decided, if I'm making a new engine, it's time to take a look at these new APIs and see whether it's time for me to upgrade. That's what we're going to talk about today: graphics APIs. How many of you know OpenGL?
1:30
How many of you are comfortable with the new, modern APIs? Okay — that's good. For the people online who can't see the hands raised: a bunch of people are used to the old APIs, maybe not as many to the new ones. I hope this will be informative for people working with the old APIs who are thinking about trying out the new ones. This is my maybe five- or six-month journey looking into them.
1:59
So let's start off with — I think we all have a sense for this, but — what a graphics API is. There's supposed to be a picture of a graphics card there, but it's not a big deal; you all know what they look like, and if you don't, it's that box in your computer that makes it go fast. And this is what it looks like when it doesn't do its job — when you're not good enough at Vulkan. So yeah,
2:19
a graphics API is a way to talk to your GPU. You can't just run x64 code on it — well, there are actually ways you could run C on your GPU, but not in the normal way; you're not going to compile a Windows executable and run it on your GPU. So a graphics API is essentially just a protocol that you use to talk to the GPU. Now, the thing is, there are a few different GPU vendors, and since there are a few different GPU vendors, you get a few different graphics APIs.
2:49
Wait — aren't those GPU vendors on the screen? Are you telling me that there are three different graphics APIs and three different GPU vendors, all three GPUs support all three APIs, but which ones you get to use depends on the operating system you're running? That's a fundamentally silly situation to be in. Unfortunately, we live in a fundamentally silly world. You could have your MacBook, reboot it into Linux, and now you need to use a different graphics API to talk to the same graphics card. And all of these APIs are essentially the same as each other — if you're worried about which one to use, it doesn't matter. The details differ a little, the windowing-system integration differs, but they all do the same stuff because it's the same hardware. Today we're going to be talking about Vulkan, because it runs the most places and it's managed by a nonprofit, but if you're not working in Vulkan, all the same stuff is going to apply — these APIs are almost identical. In
3:52
particular, for my game I want it to run everywhere — well, that's not quite true: I want it to run on PC platforms and on consoles. As part of that, I'm building a GPU abstraction library, like everyone ends up doing, because you need to support all these APIs. And whenever I'm adding a new feature, I check: okay, what would it look like in DirectX? The answer is exactly the same almost every time. So if you're doing DirectX, no problem.
4:17
We're going to do a brief history of graphics APIs here. I wasn't around for most of this and the details are hazy, but I want to build up to why we have the modern APIs we have today and what's different about them. In the early days we had vendor-specific APIs: if you wanted to target a certain graphics card, you'd use the API for that card. Makes enough sense, but there's a familiar downside — you have to use a bunch of different APIs. If you want to support multiple cards, you end up writing against multiple APIs, and that's a pain. Around '92, one of those APIs, IRIS GL, became OpenGL. It was released as an open standard that other graphics card vendors could implement, which is a really great idea, because now you can write against one API and run everywhere. But this idea has one fatal flaw, and the fatal flaw is that it's a good idea — and you know who doesn't like good ideas? Microsoft. So Microsoft of course had to make their own OpenGL and call it DirectX, and you still end up with two APIs that are the same as each other. Around 2004, we started moving
5:26
away from fixed function to the shader model. With the fixed-function model, you're essentially just configuring your GPU to do specific things. With the shader model — I think you could do something like this before with some kind of assembly — you can actually write a text program that runs on your GPU and does nice special effects. Maybe you want to make a shiny object or a bumpy object: you can just write the code for your game to do that. It gives you a lot more control. Of course, it also makes things more complicated; every time we step to the right on this graph, things get a little more involved. If you're doing fixed function, you just tell the graphics API to draw some stuff and it does it; here, you have to write a bunch more code.
6:08
Around 2014 we started getting the modern APIs, and they've become increasingly relevant. People are doing more and more with GPUs, and it's not necessarily just graphics. In a way, GPUs are just giant SIMD machines — single instruction, multiple data. You have, say, a multiply operation, and you want to do it on a bajillion inputs at the same time: it's the same operation, but a lot of different data. Graphics is really amenable to that kind of optimization, but lots of other things can be done quickly that way too, and people want to use GPUs for all kinds of cool stuff. So the GPU APIs got a lot lower-level: you got a lot more control and the ability to do way more with them, but as such they got more complicated and harder to work with.
6:59
So let's look at this triangle here. We're going to start with a fixed-function OpenGL sample I found online. You probably can't read it from there, but it's about 40 lines of code, and the gist is: here are some vertices, here are some colors, now loop and draw that triangle. Even if you're not a graphics programmer, you can pull this up at home and understand what's going on in a couple of minutes. It's not too complicated. Now, what I'm interested in is: should I move to Vulkan for my next game? So let's look at the same code, but for Vulkan. Okay, it's a little bit longer, but a lot of it's comments and whitespace, so it's not really a big deal. Oh, sorry, there's another slide. Okay, so it's a little bit longer, but most of it's comments and whitespace — not really a big deal, a little more complicated. Sorry — how many pages of this are there? Geez. Okay, maybe I see why people are complaining.
7:59
So we took 40 lines and turned it into, what's that, 1,180 lines. There are some more comments and whitespace, but almost none in these last two pages. So it's a lot more code; there's a lot more going on. And remember, that's just for this — this is not what my game is going to look like; it's going to do a bit more than that. If that's how much code it takes just to make this, imagine what the full game is going to be like. Also, I'm an indie making an engine — understandably, most indies aren't making engines — and I'm the only programmer on the team. I have to build the game, I have to build the engine, and the renderer is not the whole engine, it's just a part of it. Build the renderer, build the engine, build the game — and I'm supposed to do all that just to get a triangle on the screen. I kind of get why people feel like this.
8:56
Okay, here's the thing, though. You may not like it, but this is what peak performance looks like. Also, it's simpler. "What are you talking about? It's simpler?" I actually think Vulkan's great, and I was not sure I would think that when I started. So if you're coming at this thinking, "Really? 1,100-plus lines for something that could have been 40? Ridiculous. I'm just going to use OpenGL forever" — that's fine, do whatever you want. But I want to make a pitch that it's worth looking into these new APIs, and that it's actually not as goofy as it maybe seems at first glance. I understand I have my work cut out for me here — I'm also claiming it's simpler, and how is more code simpler?
9:38
So we're going to start with some Vulkan concepts. I'm not going to go line by line — we don't have time, and I don't think it would be particularly engaging — but we'll go through the core concepts that are making the code longer. I won't deep-dive on each of them; there are other talks that introduce writing Vulkan. Just: why is it longer? What's happening in all that extra code? Once we have a rough understanding of what's actually going on, I'll go over a way of doing fast, flexible, and simple rendering in Vulkan that I'm using in my new engine. It's not the only way to do good rendering in Vulkan, but I want to show you that it's possible to keep things simple while using these new APIs and still benefit from them. And lastly, this has maybe shifted my thinking a little about APIs — what makes an API good or bad, how to evaluate one, how to design one — and maybe there will be some takeaways even if you're not a graphics programmer. Okay, so let's start out with some
10:38
Vulkan concepts — and we're going to compare them to OpenGL, since it seems like a lot of the audience is familiar with OpenGL. First things first: when you look at a Vulkan sample, you're going to run into this thing called a physical device and a logical device. The physical device is just the GPU you're using, and the logical device is kind of an instance of it that you've configured a little. OpenGL doesn't really have this concept: when you're writing an OpenGL app, you just create a context and write some code, and you don't get control over which device it runs on. That's a problem. If you have a laptop, it probably has two GPUs: an integrated GPU on the CPU and a separate discrete GPU, which is going to be way faster. If you're running a game, you really want it on the fast GPU. Gamers probably spent a bunch of money on their GPU; if they buy your game and it runs slow because it didn't run on the fancy GPU, they won't know that's why — they'll just think your game is slow. That's a pretty sad outcome: you wrote something fast enough and it just ran on the wrong hardware.
11:44
Well, OpenGL doesn't really have an answer for you, and it's kind of Microsoft's fault — I believe old DirectX had a solution for this, but Microsoft didn't want you to use OpenGL, so they never updated their windowing integration code. Regardless, you couldn't do this in OpenGL, so the sample code is shorter — you literally skip this step. To be clear, OpenGL games do have a workaround: you declare a weird symbol, set it to a specific value, and the driver looks for it. There's a way to do it, but it's mega jank, and good luck figuring it out if you don't already happen to know about it. Ask me afterwards if you want to know how to make that work in OpenGL — I have an OpenGL game; I can show you the line of code. So in Vulkan you're going to write a few lines of code just to pick the right GPU — but you want to do that anyway, don't you?
12:39
next concept, uh command buffers. So in Vulcan, uh, when you want to tell the
12:44
GPU what to do, instead of giving it one, you have to remember like GPUs are asynchronous, right? Like it's a separate device. It's not like you're
12:50
running the code on the CPU. So you're going to have to send a command to the GPU to tell it to do some stuff. At some
12:55
point, it's going to do it and complete. Uh, you don't send one command at a time. You send commands in buffers of
13:01
commands. And OpenGL under the hood, that's, you know, presumably the same thing that's happening. But OpenGL doesn't expose the command buffers to
13:07
you. Uh so in in Vulcan it's like you have to actually pick like a queue which
13:13
is a port on the GPU essentially like each each port maybe is like better at doing different operations like maybe
13:18
you have a port for transfers a port for compute port for graphics. Um maybe you
13:23
just have one for everything. I think the graphics ones always do everything. Um and then you you know have a command
13:28
buff for allocator and you allocate your command buffers from there. You fill them up. You send them to the GPU. So, you know, it's some extra extra stuff
13:34
you're doing to get the commands there as compared to OpenGL where you just like send a command and it makes it more code. And in the triangle example, it
13:41
definitely inflates the size of our code. But if you've how many people have shipped something in OpenGL?
13:47
How many of you reinvented command buffers in your OpenGL app? Like, okay, yeah, almost all of you. Like only like
13:53
two hands went down. So, like, yeah, you know, our triangle example is shorter because we didn't need to reinvent command buffers in it. But the reality
13:59
is command buffers are a super useful structure. when you're like, you know, building a more complicated graphics app
14:05
and everyone writing OpenGL code is reinventing them anyway. So your actual Vulcan app like didn't have to reinvent
14:12
them because they already existed in the API. I may maybe you have a software command buffer abstraction, but you don't need it. Uh so yeah, I I don't
14:19
need to dwell on command buffers. It seems like you all made your own command buffers anyway, so you should be pretty familiar with the concept now. You don't
14:24
need to make them anymore unless you unless you want to. Uh okay, we got pipelines. Um, so the
14:30
Next: pipelines. The idea of a pipeline is that you're going to have the GPU render some stuff, and there are a bunch of ways the GPU can be configured — a bunch of fixed-function options you can change, plus the choice of which shader to run. In something like OpenGL, you have function calls to turn options on and off: "I want to set my blend mode to additive because I'm making a lightsaber and it's going to be all glowy; now turn it back to normal to render something that's not supposed to be glowy." You call a bunch of different functions to toggle these things, and eventually you call draw. It's a giant state machine, everything's global, and it's kind of hard to manage. In Vulkan, all of these options get baked into a pipeline, and then you bind the pipeline. Say you have a bunch of particle effects that are all glowy and additively blended, with some other settings at specific values: you put them in one pipeline, and you switch to a different pipeline for your other objects.
15:28
What's nice about this is that you know where the cost is. In OpenGL you write a little less code, because you can set the configuration options right at the moment you draw the object, but it's hard to understand which options are expensive to change and which are cheap. Sometimes you find your game running really slow and realize: oh, I wasn't supposed to change that one a bajillion times per frame, while the other option was fine to change repeatedly. If you really know how GPUs work, I imagine you'd understand implicitly which are slow and which are fast — but when I was starting out with OpenGL, I had no idea; I'd just have to experiment and ask people online. So, can you just build this all into a pipeline? It's more code, but you're also being a little more honest about what the computer is actually doing.
16:22
saying the structure of the data that you send into the shaders. Um relates to the descriptor sets which is like the
16:27
actual kind of uh like the uniform data and stuff like the options you pass into your shader. Uh and the layout is just
16:33
the layout of that data. We're I'm glossing over the details because I think all all the all that stuff you can
16:38
find online. But the point is that this is a useful concept. It's more code you're writing. But you know often you
16:44
actually maybe not as common as command buffers, but it's often that in OpenGL you'd end up creating something like this. Anyway, just to get away from the
16:50
mass amount of global state that OpenGL has. Okay, another one we got uh some of the
16:56
older APIs might have had this if they had better windowing system integration, but OpenGL doesn't really have a concept of the
17:02
swap chain. The idea here is that long gone are the days of rendering pixels straight to the screen. You're
17:09
going to be rendering pixels into a buffer somewhere that the window manager is going to display on the screen at some point. And as such,
17:17
there are different strategies that could be used to display your images. For example, you could be using a
17:24
standard VSync setup. There are details even within that, but you could be
17:29
using triple buffering. There are lots of different ways to configure how the images you render are shown on screen, how often they're
17:35
shown, in what order, and how deep the buffer is, which is going to affect input latency. All
17:42
of these things can affect how smooth your animation is, and what the cutoff is: if you miss your frame, do
17:47
you have to wait until the next VSync, or does it show right away but with tearing? There are
17:53
lots of reasonable trade-offs you can make, but the point is you probably really care about this stuff if you're making games. You probably really care about
17:59
when your frames are presented, and with exactly what parameters, and a swap chain is just how you
18:05
configure that. In something like OpenGL, there are some extensions for controlling VSync, but
18:10
they're never implemented correctly on any of the vendors I've tried them on. The basics work, but the details never really work
18:16
correctly, the user can always override it, and you just don't get a lot of control. I would argue that even in Vulkan, I would like the main swap chain extension
18:23
to be lower level than it actually is. So it's more code that you're writing, but you probably want your
18:29
animations to be smooth if you're making a game, right? So you probably want to know how your images are being presented.
18:35
We've got synchronization. So yeah, I'm not going to lie, synchronization in Vulkan is complicated, right? But
18:42
we all know that multi-threaded programming is hard, right?
18:48
Everyone's aware this is a difficult thing to do. And when you're doing graphics programming, not only are you doing multi-threaded programming, you're
18:54
doing multi-threaded programming on a separate device from the main device. So of course it's hard, right?
18:59
There's complicated stuff going on. This is going to be a
19:04
challenge, but someone had to do the same
19:10
work to make your OpenGL code work. It just wasn't you. The complexity is still there, but OpenGL had to
19:15
assume the worst case, because OpenGL pretends to be synchronous. There's sync and flush,
19:21
but for the most part you just write your code and it all just works (except when it doesn't), and OpenGL doesn't know what you're going to do, which means
19:27
it's assuming the worst case, and you're giving up a lot of performance. So yeah, you've got to spend some time
19:32
figuring this out, and it will result in more code that you wouldn't have had to write in OpenGL, but you have control over your
19:39
multi-threading, which you probably want. You're trying to go fast, right? So it's hard for me to complain
19:45
about that. A big one that I see a lot of people complain about is memory management. And
19:51
it's worth mentioning here that I'm just kind of flying past a lot of details. I have stuff
19:57
in the speaker notes that might be useful to you if you're going home and
20:02
writing Vulkan, which I'm not going to read here because it's just quotes
20:08
from the specification and stuff. I'll have a link to the slides at the end, and you're welcome to look at them on your own devices
20:14
and reference them. But the gist of it is that something like OpenGL basically just
20:20
gives you a heap allocator with reference-counted memory. So it
20:26
is very easy to get started, because it's all automatic. But most of you are not writing your
20:32
game engines in garbage-collected languages. So why would you want your GPU API to be garbage collected, right?
20:38
If you really think about it... you're used to it, but is that really a choice you would have made for yourself?
20:45
Probably not, right? You'd probably just use an arena allocator, or do stuff per level or something. We'll talk more
20:51
about that later. Furthermore, GPUs have many different types of memory that are good for different things,
20:58
and you do have some flags in OpenGL that control this stuff, but it's kind of hard to understand what exactly
21:03
they're for, you only have so much control, and you can't really query what the GPU actually supports. You
21:09
just kind of hint at what you want, and maybe you get it, maybe you don't. In Vulkan, it's very, very manual. You
21:16
essentially get a page allocator. I don't know if it's exactly correct to call it that, but the gist of it is that you are not really supposed to allocate small
21:23
amounts of memory. You're supposed to say, give me a bunch of memory, and I will deal with dividing it up. On
21:30
the surface this is a lot more work, and a lot of people will recommend using something like Vulkan Memory Allocator, which is a totally legitimate
21:37
way to do this. I'm going to make an argument for a simpler approach later on. But if you were to come at this
21:43
for the first time, you'd probably end up with something pretty complicated. We'll come back
21:48
to how I think you can do this pretty simply. CPU side, it's not really too different from something like OpenGL.
21:55
The driver objects, the driver is going to allocate however it wants. So, there are other features, but I think
22:03
those are the main ones I'm interested in today. The takeaway here is that the complexity you
22:08
see in Vulkan existed when you were writing OpenGL. It's just that you weren't the one writing it; it
22:14
was the graphics driver implementing most of it. And I have seen a lot of people online
22:20
saying things like, Vulkan's not good for indies. Maybe if you're a big AAA studio it's good, but
22:26
for indies it's too much work. And I honestly feel the other way around, in that it is a
22:33
bunch of work, but you only have to do this stuff once. And you
22:38
know, NVIDIA isn't going to meet with me to optimize my game for their new GPU. I don't have that
22:47
kind of mass audience; it's not going to affect their bottom line. So I
22:52
don't want to be at the whim of the GPU drivers, right? If I write it in Vulkan, I get to decide much more of
22:59
what's actually happening. Whereas if I were to write it in OpenGL, the driver is making all these judgment calls about how to interpret my intentions and
23:06
try to optimize them. And if you're a big studio, the vendors will work with you directly. I just don't really think that's realistic for an
23:12
indie. So I actually think that as an indie, having this low-level control is almost an equalizing force,
23:17
right? We all have the option to fully take advantage of the GPU if we can use it at a low level. Whereas
23:24
if you're writing against a high-level API, you're dependent on making a deal with the company that made the
23:30
GPU, and that feels less fair to me. So yeah, the complexity is there either way. Someone
23:36
has to write this code. In the new APIs it's you. It's more work, but you get
23:41
to go faster. And you don't have these issues where your shader worked on one person's GPU and not another, because you're the one compiling
23:47
the shader. All right. So let's get
23:53
into actually using some Vulkan. Let's talk about fast, flexible, and simple rendering in Vulkan. This is, like I
24:00
said, not the only way to use Vulkan, but it's what I'm doing right now in my engine. Maybe I'll adjust it as
24:06
I progress on my next game, but it's working for me for now, and I think it's a much easier way in than how I see a lot of
24:13
people get started with Vulkan. So we're going to go over my high-level strategy. I'm going to talk about
24:18
two lenses through which you could look at this kind of stuff. One is you could look at writing your renderer as
24:23
programming the API. You could say, okay, what is Vulkan capable of, and
24:29
how do I use that to accomplish my goals? You could write your renderer that way, but the lens that I
24:34
prefer is: what is the hardware capable of? How can I use it to accomplish my goals? And then, okay,
24:39
great, how does Vulkan let me tell the hardware to do that? It's
24:45
not that different, right? But I think it results in some better decisions and simpler outcomes,
24:51
and we'll probably see why in these next few stages. So first, let's talk about memory
24:57
management. Instead of looking at it like, okay, what functions does Vulkan
25:04
have for memory management and when should I call which, let's think about: why are we doing memory
25:09
management on the GPU? What are we even using VRAM for? Probably we're using it to store things like
25:15
level data, right? We probably have a bunch of meshes and textures and
25:21
stuff for our level, say we're doing a level-based game. We might not be, and we can apply this elsewhere, but
25:27
we probably have a bunch of meshes and textures and stuff for our level that need to be on the GPU so we can render them quickly.
25:34
Well, we could just do one big allocation at the start of the level when you load it, or one per memory type
25:40
that we need, and then just stick all of the data in that allocation, right?
25:46
And then when you leave the level, you just free it. Oops, go back here. And it's
25:54
what you're probably doing in your engine anyway, right? Probably your actual CPU code isn't
26:01
loading individual files and dropping them all the time. When the level loads, you
26:07
hopefully put a bunch of data into one big contiguous buffer, because it's all going to be cache coherent. It's going to be easy. You don't have to
26:13
worry about constructing and destructing individual objects. You just say, look, that's the memory for the level, and it's gone when the level's
26:18
over. There's no reason the GPU has to be any different. So yeah, I
26:24
think if you were to look at this from an API perspective, and you're used to OpenGL, it would be really tempting to sort of
26:30
recreate OpenGL. You'd be tempted to say, okay, geez, it's really tough that there's no heap
26:36
allocator. I'm going to make my own heap allocator for Vulkan. Now that I've made my heap allocator for Vulkan, let
26:41
me add the reference counting like OpenGL had. Now you've just remade a worse OpenGL, right? And you've kept all the complexity. On the
26:47
flip side, you could just take advantage of the fact that it's low level. You say, "Great, I only really wanted to load stuff when the level started, so put it all in one big buffer." Very
26:54
small amount of code, easy to debug, easy to maintain. And you can adapt this to other styles of game. Maybe you
27:00
have an open-world game. Great: divide it into chunks, each chunk is a buffer, load the chunk when it appears
27:05
on screen, and free it when you no longer need it. So yeah, memory management doesn't have to be complicated. There
27:11
are a lot of details you've got to get right. I think the speaker notes where I first mentioned memory management have some interesting stuff.
27:17
A lot of the tutorials cover this stuff. So yeah, you definitely have to learn what all the different flags do. But you get that
27:25
stuff right once, and then you can just stick all your memory in a buffer and you're good to go. Let's talk about vertex data. It's pretty common
27:33
in OpenGL, right, to have a bunch of different models with different vertex layouts. Maybe you have some
27:39
models that are completely static, and all they have per vertex are positions
27:45
and maybe UV coordinates. Maybe you're doing some skinned models and you pass in some extra data to help
27:50
with the skinning. And you end up having to tell the API about all these different
27:57
vertex layouts, and you have to swap them in and out. You could do all of that in Vulkan. But, you
28:03
know, vertices are just data. We could just put them all in a giant buffer and
28:08
read from that buffer in the shader. You could do this in OpenGL too, actually; there's nothing stopping you. The point, though, is
28:15
that we can completely sidestep that entire part of the Vulkan API. We just say, look, every single model in this
28:20
level goes in one giant buffer, all in a row, and the vertex shader just indexes that buffer when it's time to draw
28:25
stuff. This is called vertex pulling. You might think that would be slow. It turns out (do
28:33
your own research, profile and stuff) that my understanding is that on desktop hardware this is basically what's happening under the hood anyway,
28:38
and there's not really any advantage to vertex buffers in most cases. You do still want to use index buffers,
28:44
because they allow for some optimizations that wouldn't be possible otherwise. But you don't need to create all these different
28:51
vertex object types and stuff. You can just put them all in a buffer and you're good. Also, on top of sidestepping
28:58
that whole part of the API, you get a lot of flexibility and potential performance, because now
29:04
you have the option to pack your vertex data in creative ways, or even just to generate it. Because if
29:10
your vertex shader just gets a vertex and instance index, it could index into your
29:15
buffer to get the vertex positions and data, or it could just generate them, right? You just tell it,
29:21
draw 100 vertices, and it could do some math and decide where they are. You don't even need a model in the first place sometimes. So by using vertex
29:28
pulling you get a lot of flexibility to be creative about where your vertex data
29:34
is coming from. Okay, uniforms. In OpenGL, you
29:41
often call the glUniform* functions to set, basically, a shader parameter. Maybe you're
29:47
drawing an object and you want to make it red, so you first call glUniform with your red color,
29:53
then you draw the object, and when you draw it again green, you call glUniform to change it to green. You can do this sort
30:00
of thing in Vulkan too. It's a little more complicated: you've got to learn about descriptor sets, descriptor set
30:05
layouts, pipeline layouts. And if you're writing Vulkan, you should learn how this stuff works; it's worth understanding the details.
30:12
You could end up building a system that is very similar to OpenGL. It's going to be more code and more complicated, but you can do it, and
30:17
it'll be fast. Alternatively, though, you could just put all of your uniforms for the entire scene into one
30:24
big buffer and then index it in your shader, just like we did with the vertices.
30:29
And it turns out that's pretty fast and really simple.
30:34
And now you don't need to use a lot of the API. I mean, you still need a descriptor set, but you just need one. This is also going to
30:42
let us do a bunch of cool stuff on the following slides that wouldn't be possible otherwise. This is
30:47
called bindless rendering. The idea is that instead of having to bind a bunch of stuff between each draw, you
30:54
just say, look, here are all my parameters for the entire scene in one big buffer. Now you don't need to bind things
31:00
all the time. You don't need to call into the driver, which means you don't have this point of potentially variable performance,
31:06
because when you call into the driver, that's not your code. You don't know exactly what it's going to do, and it might be slow. If you just put
31:13
everything into a buffer, then that's all your code. You have a memory-mapped buffer; you just write to it in your
31:18
language of choice. And if you're working with other people, they don't even need to be graphics people.
31:24
You just tell them, look, this is the layout of the buffer. You don't need to know Vulkan; you don't need to know rendering. You just need to know what
31:29
type of data is supposed to go in this buffer that's going to be rendered, and then any code in your engine can write to that buffer. So yeah,
31:37
all of your per-object configuration just goes in a buffer, and that sidesteps all of that complexity
31:43
as well. Let's look at draw call generation. A common thing when
31:51
you're writing against older APIs is to try to optimize for fewer draw calls, right? You can only draw so
31:57
many things; every new draw call takes some more performance.
32:04
I forget how I laid the slide out. Oh yeah: what you used to do in the older
32:09
APIs, what you had to do (I think they got extensions for this stuff too), is one draw
32:14
call per object, maybe using instancing if you have a bunch of the same object. But you can actually use a feature
32:20
called draw indirect to avoid this too. You're going to see a pattern here: it seems like almost always my answer to
32:26
everything is fill a buffer with data and send it to the GPU, because that's what we're trying to do, right? That's the job of the GPU API:
32:33
to communicate stuff to the GPU. So for draw call generation, we aren't going to draw individual objects. We are again going
32:40
to put all of the draw arguments into one big buffer and then tell the GPU, that's what we want you to draw. This is called draw indirect. The idea
32:47
is that when you draw an object, you typically pass in a few arguments, like what vertex to start at, what index to end at, a couple of other things.
32:54
You just put all of the arguments that would go into that function into one big buffer, and then you call draw once, and it will iterate through
33:00
the buffer and draw all of those things. This is great because, again,
33:06
any part of your code can do this. It doesn't need to interact with the GPU API if you're supporting multiple different GPU APIs.
33:14
This part is the same; you don't need to abstract anything away. But also,
33:19
you can do this from multiple threads. You can even do this from the GPU. There are cool, fancy
33:25
things people do where they'll do occlusion culling on the GPU and then emit which objects they want to draw into this list from a
33:31
compute shader. This is a super flexible way to do rendering, and in my opinion it's a lot easier. You just put
33:38
all the data in a buffer, send it to the GPU, hit draw. Okay, let's get to materials. So
33:46
again, this is a source of potential complexity, right? We could say, okay, every new material is going to get
33:52
a new pipeline object. We're going to have a new shader that has new logic for that material, and then a
33:58
bunch of new configuration options. And then we're also going to have to sort our draw calls, because we don't want to be changing
34:04
shaders all the time. So okay, we sort all our draw calls by material type and then issue one draw indirect
34:09
call for every material. You can totally do that; there's nothing wrong with it.
34:17
But there's another option, because we already sort of have materials with what
34:22
we've described so far, right? We were just talking about uniforms and how you can put all of your arguments for a shader into a buffer. There's no
34:29
reason we can't use that for materials. You can just put a giant switch statement in your shader where you
34:34
switch on a material index, and then you run different code depending on which material you're on.
34:41
So if you're on a shiny object, you put in the shiny index and it runs that function, and if you're on a matte object, it runs the matte function. And I know
34:47
what you're thinking: conditionals in my shaders? Stack
34:53
Overflow told me that was slow. This guy doesn't know what he's talking about; I'm not going to pay attention to the
34:58
rest of this talk. But it's important to understand why conditionals are slow. The gist of it is
35:06
that GPUs are giant SIMD machines, and they want to run the
35:12
same instruction on multiple data sets. If you have a conditional, and this pixel
35:17
runs this side of the conditional and the pixel next to it runs the other side,
35:24
that's not really a thing the GPU can do. So it's going to end up stalling while it waits
35:29
for one of them to finish. And then if there's an else, maybe it has to stall again while it waits for the other one to finish. That's slow. You don't want that. But we're talking about materials,
35:37
right? Per-object materials, which means you're always taking the same side
35:43
of the conditional within an object, which means you never have to stall. So in the case of this
35:49
uber-shader approach, which is what people typically call it,
35:56
a conditional is just a conditional. Is there a cost to it? Yeah, there's a cost to a conditional, but when you read
36:02
people saying don't put conditionals in your shader, they'll make everything slow,
36:07
either this is not what they're talking about, or they're just wrong. So yeah, you can totally put a giant
36:12
switch statement in your shader and switch on the materials, and now you have a single pipeline for your entire
36:18
game, which also sidesteps all the complexity of having to cache pipeline objects between runs so that your load times are faster, because you've just got
36:24
the one anyway.
36:29
Here are some details on how I do materials. You don't have to do it this way; there are many ways to do it.
36:36
This is a little bit of pseudocode, because GLSL doesn't actually have u16s, unfortunately; I pack
36:41
the u16s into a u32. But the gist of it is that each object that I draw has
36:46
a material index, which is the thing we're going to switch on to decide which function to run. It
36:53
has 16 bits of miscellaneous material data that's interpreted per material, so one
36:59
material might use it for a completely different thing than another; it's up to the function that draws that material to decide what to do with it.
37:05
And then it has a second piece of miscellaneous data that's 32 bits, which could hold whatever, but often
37:12
you're going to need more data, like a matrix or something, in which case this will actually be treated as an index into a separate area of
37:19
memory that holds larger data. The reason I do that is that you want to pack all this data tightly. If you
37:25
tried to put all of your material data in a single material instance, you'd have to make it big enough for the worst
37:31
possible case, and then all of your simple objects that need very few configuration parameters would still
37:36
take up all of that space with padding, which would be a little goofy. This is just like the stuff from
37:44
Andrew's practical data-oriented design talk; the same sorts of ideas apply, and you can do all of that on the GPU too. So yeah,
37:51
the idea is you put some data per object in a list of these material instance
37:56
structs, and then whenever you need more data that doesn't fit, you treat one of these fields as a pointer into some other buffer where you store that
38:02
extra data. Those two lines at the bottom there are a useful little trick if you happen to be writing
38:07
GLSL. GLSL doesn't actually support unions, but you can accomplish a similar thing by binding multiple
38:14
variables to the same binding number, and then they'll get reinterpreted as whatever type you list them as.
38:21
My understanding is this is fully allowed, and it's very convenient. You do have to
38:27
make sure you do your math right so that the index is actually the correct index, or store byte offsets or
38:32
something. But this lets you store all of your data very compactly and get it to the GPU very
38:38
efficiently. So yeah, that's the
38:43
summary of my rendering approach. It's pretty straightforward, right? You
38:50
generate a buffer of which vertices you want to draw, just the start and end indices;
38:56
you generate a buffer of all of the per-object configuration; and then you generate that separate buffer for data
39:02
that didn't fit in the previous one, for bigger material data; and then you issue one draw call for your entire
39:08
scene that just goes through and draws all of the objects. So yeah, we fill up a few
39:14
memory-mapped buffers, and then we tell the GPU go, and it does. And it's
39:20
really not that much code. I should have counted before I came here so I could tell you the
39:26
number, but that triangle example was maybe 1,100 lines or whatever. I'm pretty sure
39:31
my renderer with all of these features is, I don't know, two or three thousand lines of code. And it has all
39:36
this flexibility, right? It has a full material system; you can fill in different logic for different objects.
39:43
There are a few things I skipped; I didn't mention everything.
39:49
It has instancing; it has all kinds of stuff. And it's just not really that much code, because
39:55
ultimately all we're doing is filling buffers, sending them to the GPU, and drawing them. So what does this get
40:00
us? Minimal driver overhead, for one. All the binding of resources has a cost; all of the individual draw
40:06
calls have a cost. Minimal API surface area, which makes my life easier as one person writing a
40:13
renderer, and when I want to port to other APIs it makes my life easier, because when I want to port to DirectX,
40:18
or to whatever NVIDIA's API on the Switch is, or something, there's less stuff I have to
40:23
implement. I just have to make sure I can get my buffers to the other device. It gives us the flexibility to
40:29
generate draw data from multiple threads, or from the GPU. I didn't really go over it, but in my
40:36
engine right now I actually do issue more than one draw call, but only because I have a thread pool and I do this work simultaneously on all threads.
40:41
Each thread pool task issues a draw call, just because, hey, why not do it all in parallel?
40:48
And if you're working with OpenGL, trying to do stuff in a multi-threaded fashion is a little tricky sometimes. OpenGL isn't really
40:56
designed with that in mind, or at least wasn't from the beginning. There are some ways to make it work.
41:01
Here it's just not a big deal. Don't get me wrong, Vulkan does let you do multi-threaded stuff. You could use a more API-centric
41:08
approach where you're working with more parts of the API and do it multi-threaded in Vulkan; it is very
41:13
well designed to allow for that. But you don't even need to think about any of that. It's just all the normal rules you're used to for
41:19
multi-threading, because you're just filling up your own buffers with data and then, all in one
41:25
go, telling the GPU what to do with them. So yeah, we did skip over a few things. I didn't talk about instancing,
41:32
and in your actual game you might have extra passes for post-processing and stuff, so you'll probably end up with a couple more
41:37
pipelines than just the one. But none of this stuff really changes the overall strategy. You can
41:44
use this approach and add on these extra features.
41:49
So, okay, this all sounds great. Uh, is it actually fast though, right? like am
41:55
I just pitching you a simpler way to use Vulcan that is also slower and means you should have just used OpenGL to begin
42:00
with. Uh I'm going to show you a little demo here. Start out with just a screenshot. Nothing impressive here, but I want to explain what's happening. Um
42:07
Here we have a bunch of individual game objects on the screen. It's not
42:13
a particle system demo. If I were going to do a particle system, I would do it fully on the GPU, put a hundred bajillion particles in, and it would
42:19
go fast. The point here is that I have a bunch of different game objects. If you have an entity component system, these are your
42:25
entities. Whatever your way of organizing your game objects is, this is them. They exist on the CPU, and I'm going to send them from the CPU to
42:31
the GPU every frame. I'm not caching any data; all of the colors, the positions, and
42:36
everything gets sent every time. And I'm running this on a six-year-old laptop or so. I want to see how many objects can
42:44
I render using this approach and still hit 60 Hz on this six-year-old laptop
42:50
I'm testing on. And they all potentially have different materials and everything, too. Okay, so we started out
42:56
with 100. Hmm, it's not showing up there for some reason. I'm
43:02
supposed to have a slide with a picture with some more dots. Ah, there it is. Okay, I think this is
43:07
a thousand. Still getting 60 Hz. Okay, we got 10,000. Here I've
43:14
got a question. How many pixels do you think there are on a 1080p screen? It's what, about two
43:21
million. Okay, let's do two million. Can we get that video on here?
43:31
Okay. Yeah. So, I mean, I don't know. Does your game need to render more objects than there are pixels on the screen?
43:37
No, I didn't think so.
43:46
So, obviously this is a goofy demo. The objects are all very simple, but the point is
43:51
that getting the data to the GPU was cheap. Obviously, if there were two million objects with fancy
43:56
reflections on them, sure, it wouldn't run at 60 FPS, but that's fine. That's your shader code that's responsible for the fancy effects.
44:03
The point is what we're trying to figure out today: how do we use Vulkan to get the data to the GPU so that we
44:09
can run it? And clearly, if this approach can draw two million objects at 60 FPS on a
44:16
six-year-old laptop, then yes, it's totally fine. This is a totally
44:22
realistic, pragmatic way to get the data to the GPU, and then you can do all the normal optimizations you would do on
44:28
the shader side to make your shaders fast. Also, you can just not draw two million objects, and you can cache
44:34
data, and if you're doing a particle system, don't do it this way; just do it in a compute shader.
44:40
But the point is that it's possible. This works. This is a totally viable way
44:46
to render your entire game, and lots of games do use this kind of approach.
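A minimal sketch of the demo's upload strategy, with illustrative names and a made-up 12-byte layout (not the speaker's actual code): every frame, pack every object's position and color into one contiguous blob, then hand that single buffer to the GPU and draw.

```python
# Hypothetical sketch: each frame, serialize all game objects into one
# contiguous byte buffer ready to memcpy into a mapped upload buffer.
# No caching -- positions and colors are re-sent every time, exactly as
# in the two-million-object demo.
import struct

def pack_frame(objects):
    """objects: list of (x, y, rgba) tuples -> bytes ready to upload."""
    out = bytearray()
    for x, y, rgba in objects:
        out += struct.pack("<ffI", x, y, rgba)  # 12 bytes per object
    return bytes(out)

objects = [(float(i), float(i) * 2.0, 0xFF0000FF) for i in range(10000)]
frame_bytes = pack_frame(objects)   # upload this blob, then issue the draw
```

The GPU side then reads these records by index (vertex pulling), so the CPU's entire per-frame job is building and copying this one buffer.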
44:52
Okay. So what makes a good API? Lots of people have different opinions about this. I'm not
44:58
trying to say mine is the only correct way of looking at it. But going through writing this graphics API
45:04
abstraction and learning Vulkan has made me think of APIs as just a way of communicating with an underlying system. As a low-level person,
45:11
often that underlying system is hardware, but it doesn't have to be; it could be another software system. And I think, you
45:17
know, it's not the only way to look at it, but it's useful to think of APIs as being
45:23
well-designed when they facilitate communication with an underlying system and poorly designed when they obstruct
45:28
it. As a consequence, if you have an API that's doing a good job
45:33
of letting you communicate with the underlying system, but you're still having trouble completing your goals, then it's probably not the
45:40
API that's at fault, right? It's probably either the underlying system, or that your goals aren't meshing well
45:46
with that underlying system. So, you know, Vulkan definitely isn't a perfect
45:51
API. I have nitpicks and criticisms. But I don't think being low-level is
45:58
an imperfection. I think the low-levelness of Vulkan is a huge improvement. And I think
46:05
that going forward, my renderers, despite having many more lines of code, are going to be much simpler.
46:13
So yeah, that's my takeaway: maybe think about designing for hardware instead of for APIs. I've
46:20
got some useful links here. If you're doing Vulkan, you'll probably often need to look up info on
46:26
different GPUs. vulkan.gpuinfo.org is a super useful resource. Oh
46:32
yeah, the QR code here also links to the slides, so if you want to go back and look at the speaker notes for anything, you can. vulkan.gpuinfo.org
46:38
lets you look up things like how many GPUs support such-and-such a feature I'm considering using, or what type of
46:44
memory this GPU or that GPU has. The Steam Hardware Survey, linked below that,
46:50
is also great. It's a good way of figuring out which GPUs players are using in practice if you're shipping games on desktop. Some
46:58
Vulkan tutorials: I wouldn't necessarily ship the tutorial code, but they're great ways to get started understanding the APIs. And then
47:05
some useful tools. You probably all know about them, but if you don't, you should try out RenderDoc and Tracy. Dear ImGui is great. And if
47:11
you're using Zig, Snektron's vulkan-zig wrapper is really nice. So,
47:17
that's it for me. If you want to support me, a great way to do that is picking up my game Way of Rhea, which is on sale right now on Steam. I also
47:24
have copies in person. If you're interested in more of this kind of thing from me, I have some talks online at
47:30
that address, a newsletter, stuff like that. I've got a GitHub linked there where
47:35
the graphics API abstraction I'm working on will be public soon. And then I
47:40
have a bunch of other little game tools, and wrappers for existing game tools, that I'm putting up there.
47:46
Yeah, that's all I've got for now. I think we've got a bunch of time for questions.
48:00
You can sit down on that chair right there. Get comfortable. All right.
48:08
I'll take the privilege of the first question. Sounds good. Wow. I have a lot of
48:16
thoughts on this. I might have to ping you after this talk, because I told you I'm writing my own terminal emulator, and I started
48:22
using Vulkan, then switched away once I met these barriers, about barriers.
48:27
Yeah, once you literally met the barriers and you're like, "Geez, what the heck is that?" Yes. But you mentioned uniforms, which
48:34
is true. I always had that feeling that if I'm updating shaders
48:39
by calling glUniform once per frame, like I'm updating a color. Yeah, exactly. Some value, and I'm doing these matrix transforms or whatever. I always
48:45
felt like doing that once per frame was a lot of overhead. It felt a little wasteful. So I did like that about Vulkan, that you have
48:51
this uniform. I never even went to descriptor sets; I went to uniform buffer objects right away. You know,
48:56
it's just a large addressable memory, huge byte writes. Love that. Right. And you can even do partial memory writes, and you can decide when to flush it. I
49:03
think you told me you can even decide when to allocate. I love that control, for sure. And that's
49:09
simpler for the driver, for sure, because now it's all passed to you. I like that
49:14
simplicity. But even then, though, would you
49:19
recommend somebody like myself, on the just-2D side? Because a terminal is just text, right? It's just
49:25
some text and some icons. I'm pushing 2D quads. Would you still recommend I try and give Vulkan a chance
49:31
again, or what's your judgment call on that? Because I would still have to do some work to grok everything, right?
49:37
Yeah. So, it's tricky. I would say, you know, you have a project you're trying to ship, and if what you're doing is working for you,
49:43
then there's no reason to burn a bunch of time undoing it and trying something else. Fair. But I would say, to
49:51
someone in the game space, I might be a little more insistent. Not that you couldn't benefit from Vulkan.
49:57
Sure. But I might push someone doing games a little more strongly to try using the modern APIs. And the
50:03
reason being that I worked with OpenGL for a long time, and a lot of these
50:10
techniques I'm describing, you could do in OpenGL, but I never realized that, because it's very hard to learn
50:18
what's actually going on. It's kind of like if you learn programming using Python. And hey, no
50:24
shade to anyone learning programming using Python; in fact, it's
50:29
probably a good way to learn. But there comes a time when you're like, okay, I kind of want to know what the computer's doing, and
50:36
you're not going to be able to figure it out. Even though it's an easier language to do some things in, you're not going to understand
50:42
what's happening under the hood, because it's not exposed to you. And then the thing is, if you
50:47
ask a low-level programmer to write Python, they can probably make it pretty fast, because they know how it's actually working. So if you
50:53
asked me to write OpenGL right now, I could probably make that demo run at the same speed in OpenGL. But if you had asked me to make that demo in OpenGL
51:01
with no compute shaders or anything, prior to messing with Vulkan, I don't think I would have succeeded at it,
51:06
because I just really had no idea what was happening. So if you're someone for whom GPU programming is
51:11
going to be important going forward, then I really recommend giving this stuff a shot, because it'll make things make more
51:16
sense. In your case, you have a thing that you need to get done, and this is just a piece of the puzzle. So, you
51:23
know, whatever. If it seems like fun and you have time to kill on it, I'm happy to answer your Vulkan questions. If you decide, look, I
51:28
Oh, if I have your ear, then I might give it a shot again. Yeah. Cool. I would like the chat
51:35
to answer this question for me, too. And I know you can answer it, but I want to ask other questions from the audience. I'm curious about Vulkan
51:42
extensions. Ever since I made this Vulkan attempt, there have been a lot more extensions, to the point that
51:48
it almost feels like OpenGL if you really want it to, with all these layers: there are validation layers, there are extensions.
51:54
Is that really true, and is it worth it, even? It sounds like you want to say something about that. Yeah.
52:00
Yeah, go for it. So, a few different things. One thing, it's only slightly tangential, but I forgot
52:06
to mention it earlier: if you're doing Vulkan, turn on the validation layers. In OpenGL, if you make a mistake, you can call glGetError
52:12
and kind of find out about it. In Vulkan, you can turn on these layers that take all your Vulkan calls when
52:18
you're in a debug build and filter them through a separate layer, which will check that
52:24
you passed the right parameters and such, but it'll also check for common mistakes that are
52:30
easy to make and that would maybe be hard to track down otherwise.
52:35
I can't think of a good example right now. But they're going to catch problems and tell you exactly what you did wrong, and you're going to go, oh wow, I'm really glad I turned
52:41
that validation layer on. I was going to end up with crazy flickering bugs I couldn't figure out, but the validation layer says, "Hey, don't do that. You
52:46
know, your synchronization's wrong." And you're like, "Oh yeah, you're right, it is." But in terms of extensions, there are two
52:53
things I'd recommend. There are a bunch of useful extensions. If you're trying to ship something that'll work on a bunch of computers, I
52:59
recommend looking at that database I linked, because it'll tell you what the support is like. Though consider that some entries
53:05
in the database may just be from before the extension was released. So it might say a GPU doesn't support it, but if you look at a newer
53:11
sample from the same GPU, maybe it will. So the percentages might be a little wonky. But you look at
53:16
what's supported. And another trick is that I just look at what DirectX can do, even though I'm writing Vulkan,
53:22
because Vulkan is trying to support a lot more devices than DirectX. So there are a lot of
53:27
features that are optional or only available through extensions, whereas DirectX is basically just
53:32
Windows and Xbox. So if I look at something and see that DirectX has that capability built in, then I pretty much
53:38
just assume, okay, probably all of my target machines have the Vulkan extension, too. That's not 100%,
53:44
but it's a good way to make a judgment call about whether it's worth looking into. Because if you find DirectX
53:50
doesn't have it, then you're like, okay, maybe this is a new fancy thing that I can't really use. Let's try to do five minutes of chat
53:56
questions before we do the bathroom break. Okay, first question: these approaches, you know, make one buffer
54:03
and send it to the GPU. That implies there is now special logic on the GPU to
54:09
unpack these buffers. Yeah. Is that unpacking code hard to optimize? If it's arbitrary shader code, how
54:17
does the GPU do these things quickly when it can't have hardware built specifically for these operations?
54:23
Yeah. So, I'm definitely not an expert on the low-level workings of GPUs. And I
54:32
meant to thank Martin, who's up there (I can't see because of the lights), because he answers all my GPU questions. But
54:40
my understanding, and maybe someone who's an expert
54:46
can correct me if I'm wrong, is that when it comes to unpacking vertex data, when you do it manually in your shader,
54:52
it ends up being basically what the GPU would have done anyway. Sure, you told it to do that from
54:58
the API, but I don't think it's actually taking a different path, on desktop at least; that may not be true
55:03
everywhere. In terms of other data, like uniform data and such,
55:12
I think it depends so much on exactly what you're doing that I don't have a general answer. But I would say that if
55:18
you're taking a 32-bit integer and splitting it into two 16-bit integers, that's not an expensive operation. So,
55:25
yeah, the more clever your encoding gets, the more complicated it's
55:31
going to be. But none of the encodings I'm using so far are really all that involved.
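The 32-bit-to-two-16-bit split mentioned here can be written out in plain Python as a stand-in for the couple of shift-and-mask ALU ops the shader compiler would emit:

```python
# Cheap unpacking of the kind discussed: pack two 16-bit values into one
# 32-bit word on the CPU, split them back out on the "GPU" side with a
# shift and a mask -- a couple of ALU ops, not an expensive operation.
def pack_u16x2(lo, hi):
    return (hi & 0xFFFF) << 16 | (lo & 0xFFFF)

def unpack_u16x2(word):
    return word & 0xFFFF, (word >> 16) & 0xFFFF

word = pack_u16x2(12345, 54321)
lo, hi = unpack_u16x2(word)  # -> (12345, 54321)
```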
55:37
In the interest of time, I'm curating some questions, in case you see that I'm going out of order. One other thing to consider
55:44
when you're doing this stuff is whether or not it would be trivially possible for the GPU to prove that what you're
55:49
doing is consistent across the whole object. So for example, if
55:55
you're using something to index an array, but you're deriving it from
56:00
the base instance, like gl_BaseInstance or whatever it's called, the GPU knows
56:07
that it cannot vary within the draw, and so presumably it would be trivial for it to figure out, okay, we can
56:13
reuse that result for the entire draw. I don't know enough about the low-level details to know when that is and isn't done; I'm under the impression
56:18
it's done sometimes. So yeah, I designed all of the unpacking to
56:24
be presumably trivial for it to figure out that yes, the entire object will get the same result here. That's
56:31
not hard to do, though, because you're usually doing it based on your instance index, which it knows.
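A hypothetical model of that indexing pattern, with illustrative names: per-object data sits in one big array, and the "shader" derives its array index from the instance index, which is constant across the whole draw, so the fetch is trivially uniform.

```python
# Sketch of instance-index-driven lookup: the only varying input per
# vertex is vertex_id; the object fetch uses instance_index, which
# cannot vary within a draw, so a GPU could reuse the fetched record
# for every vertex of the object.
objects = [
    {"transform": (1.0, 0.0), "color": 0xFF0000FF},
    {"transform": (0.0, 1.0), "color": 0x00FF00FF},
]

def vertex_shader(vertex_id, instance_index):
    obj = objects[instance_index]   # uniform across the draw
    return obj["transform"], obj["color"], vertex_id

_, color, _ = vertex_shader(0, 1)   # every vertex of instance 1 sees the same color
```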
56:38
Next question, from another Mason. Do you have any thoughts on WebGPU or the WebGPU
56:43
implementations for indie development? These techniques you showed look as feasible on WebGPU as on Vulkan or the
56:51
others. Ah, a fellow Mason. Yeah, I mean, it seems fine.
56:57
I'm trying to figure out how to put this. All these APIs seem
57:03
really similar. I think when I was looking at it, there was maybe indirect drawing or
57:10
something that wasn't fully supported; one of these features wasn't fully supported there, and it felt a little silly, because I was pretty sure
57:16
that the underlying hardware all supported it. Yeah. I'm personally not using WebGPU just because I don't ship on web. And
57:22
I understand that WebGPU works on desktop, but every platform that
57:27
WebGPU works on, Vulkan also works on. And
57:33
all of my graphics debuggers work with Vulkan. It's what the graphics card vendors are giving
57:38
advice about directly. And, I don't know, I don't want to go through Google's stuff.
57:44
And I don't want to do text-based shaders. I don't understand why they had to do text-based shaders. I think maybe you don't have to anymore; maybe they walked that
57:50
back. But I think it's perfectly viable. All these APIs are very similar to each
57:55
other. I do get a lot of people asking me why make an abstraction layer when you could just use WebGPU, and I think that's a little
58:01
misguided, because I need to work on PC and console.
58:07
If I just needed PC, I would do Vulkan and then use MoltenVK for Mac. I'm not even shipping on Mac,
58:12
actually. And if I want to do consoles, then I'm going to need to do something else besides WebGPU or Vulkan or whatever anyway.
58:19
But it works. You could totally just use it and do most of the same stuff.
58:25
This one comes from Red Line: is there a Vulkan tutorial that isn't an object-oriented monstrosity?
58:32
Not that I'm aware of, unfortunately. It's a little frustrating,
58:37
because I think half the time I spend when I'm reading a
58:42
tutorial for a new Vulkan feature I don't know is just trying to untangle the
58:49
abstractions they built for it up front, which aren't the abstractions that I want, and I have to
58:55
undo them to understand what's happening. So yeah, sorry, I don't have any good recommendations
59:01
there. I personally rewrote every example from vulkan-tutorial.com
59:06
into vanilla C from C++17. Not easy. My recommendation would
59:13
be, yeah, undo all the object-oriented stuff, but don't
59:19
try to replace it right away, because at the beginning it's too hard to see the
59:25
forest for the trees until you've gotten the triangle working. So I would build zero abstractions and just get
59:31
the triangle; even when they tell you you should build an abstraction for this, don't do it. Get to the point where
59:38
you have the triangle, and then that's your time to go, okay, I have all of the
59:43
parts, now I can sit down and think about how to actually chunk this up in a meaningful way. I think you'll do a
59:48
much better job at that point than if you try to do it from the beginning. The final lesson in that
59:54
tutorial, one of the final lessons I think, is to import some 3D model, an OBJ model. And I remember I was
59:59
like, "Oh, it's just, you know, some OBJ models have duplicate vertices in the data. So you can just dduplicate by
1:00:04
doing a memcmp: you loop, memcmp, and discard." I was like, "Oh, that's exactly what they're going to do,
1:00:09
right?" No, they import an unordered map. They do custom template specializations with
1:00:15
hash functions so you can index into vertices as an array. No. What? I just want to render. Sorry.
1:00:21
Oh no. Well, what kills me is your game shouldn't even be doing any of that, right? Convert the data to a format that you
1:00:28
can just upload with no processing. And don't get me wrong, you do need to do the deduplication at some point, and
1:00:35
yes, it should be in an offline tool. And no, the beginner trying to learn the API for the first time doesn't need to learn how to write that tool.
1:00:40
But that's exactly my point. I'm not hating on the people who write tutorials; I'm very grateful
1:00:46
to them, because hey, that's how I learned Vulkan. But if I were going to write a Vulkan tutorial, what I would do is give a model file in a
1:00:53
binary format where it's literally just the list of positions, with no header and nothing else, and say, look, for the purposes
1:01:00
of this tutorial, just load this into memory as-is and do zero processing. Then maybe have a separate stage later
1:01:05
where I show the code for how I made that file. And then, yeah, don't do it with hash maps and such.
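The memcmp-and-discard deduplication described here can be sketched in Python, with packed bytes playing the role of the memcmp; this would live in an offline tool, so the O(n²) linear scan is fine:

```python
# Loop-memcmp-and-discard vertex deduplication: treat each vertex as raw
# bytes, scan the already-seen bytes for an exact match, and emit an
# index buffer -- no hash maps or template specializations required.
import struct

def dedup_vertices(vertices):
    """vertices: list of (x, y, z) -> (unique vertex list, index list)."""
    unique, indices, seen = [], [], []
    raw = [struct.pack("<fff", *v) for v in vertices]  # bytes to "memcmp"
    for r, v in zip(raw, vertices):
        for j, s in enumerate(seen):
            if r == s:                  # the memcmp
                indices.append(j)
                break
        else:
            indices.append(len(seen))
            seen.append(r)
            unique.append(v)
    return unique, indices

verts = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
unique, idx = dedup_vertices(verts)  # -> 2 unique vertices, indices [0, 1, 0]
```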
1:01:12
A couple more questions and we'll conclude. There are still more questions that I hope you can answer in
1:01:18
the chat later. So, from Seth, thank you for your question: have you used sokol? Can you do bindless rendering and
1:01:24
vertex pulling there? I haven't used sokol. It looks great. At the time
1:01:29
I looked at it, and this may not be true anymore, it was targeting maybe not the last generation
1:01:34
of APIs, but the in-between one, and I wanted to be on the absolute latest thing, so I figured I'd just do it myself. But it
1:01:40
looks really great. I don't know whether you can use these techniques there; I wouldn't be surprised in either direction.
1:01:48
Question from Mik: what was the alternative for the V... oh, I think he's asking about something specific. What was the alternative to the Vulkan memory
1:01:55
allocator library? Oh yeah. So, another place
1:02:00
where that library looks great, but I just make
1:02:05
one big allocation up front. Well, one per memory type. Okay, the simple version
1:02:11
is: I make one big allocation, for example on level load, and then I just bump allocate into that, because you
1:02:16
never really free stuff, right? Interesting. How often do you need to remove a texture from the GPU,
1:02:23
an individual texture, right? You probably remove the whole level. Now, if you're doing some kind of streaming thing, then you do need to remove stuff,
1:02:28
but you could just remove a whole chunk or add a whole chunk. If you really are doing something fancy, like streaming
1:02:36
in higher-resolution versions of textures, you could just leave room for them in that buffer. So the idea
1:02:42
is that you don't need a heap allocator. I really don't like heap
1:02:48
allocators in general. In a high-level language they make sense, but when you're doing low-level stuff, you almost always
1:02:54
know more about the lifetimes of your objects than a heap allocator does, so you can almost always do better. And it
1:02:59
sounds like you're doing more work then, but the answer is almost always a bump allocator, where you just add stuff: you say
1:03:07
the first object starts here, then you add the next one after it, and then the next, and they're all just next to each other. As long as you get the
1:03:12
alignment right, that's fine, and it's so simple and so fast. So that's what I do. The
1:03:18
wrinkles come in because, first of all, the alignment restrictions can be different per GPU. So you do
1:03:24
have to do some work to get this right the first time; you've got to read about what the alignment rules are. And there are also different memory
1:03:30
types, so it's not really just one allocation; you need to make a few allocations based on the different types of memory. What I recommend here is to look
1:03:36
at the memory types that DirectX exposes, because it's a
1:03:41
simpler way to understand it. Vulkan will give you like ten different flags, and essentially what
1:03:48
I ended up doing was just reimplementing the same ones DirectX gives. So I have an upload buffer, a readback
1:03:54
buffer, a static buffer, and also a BAR (base address register) upload buffer for smaller,
1:03:59
faster uploads, or big ones if resizable BAR is available. Anyway, there are three or four different memory
1:04:05
types, and within that you might need different things for vertices versus images. So there are details to get right, but the
1:04:10
point is that you make your big allocations up front, then bump allocate into them, and you don't have to worry about heap allocation and garbage collection and
1:04:16
everything. In the middle of your talk there was one comment, I think it was from Ruben, who said your talk is a great
1:04:22
high-level introduction to boneless rendering, and now there's a whole running gag
1:04:27
there that I hope you enjoy after this talk. Well done, Mason. Round of applause.
1:04:33
Nobody but you could, probably.
1:04:38
I was applauding for the boneless joke, not for myself. Yeah, boneless rendering is when you do
1:04:44
the ollie, but you put your foot on the ground, while rendering.