How does your design for a 3D monitor work?
Last updated: Apr 30, 2011 (first version)
1 About this document
In scene 16 of my book, "The Legend of the 10 Elemental Masters", there is a 3D monitor present. This monitor displays everything in 3D. However, unlike the real world's 3D TVs, which use light splitting or those strange glasses, this is something completely different, about 15 years ahead of its time. Prototypes could be made today, though very limited in resolution. Knuckles, in scene 16, makes a remark involving cubes being used to create the 3D effects. I was thinking of something that suits the time, the distant future. Upon closer analysis, I realized that this kind of technology is entirely doable. This article explains the basic idea of how such a monitor could actually work in the real world, aside from a few problems that need to be worked out. At least, with this monitor, you can experience the 3D from any position or any angle without the need for those weird glasses. Another bonus that you can't get with either the 3D glasses or 3D television is that, if you move, the scene moves with you. Tree in the way? Move to the side and you can see what's behind it.
Do note that I am not mechanically oriented and I'm not a professional at physics. Also, I've never experienced a 3D film and it's been a good half decade since I last saw a hologram (as of the time I wrote this, April 30, 2011). My only experience with real 3D comes from real world objects. I've only recently heard of 3D televisions coming out, but I've never seen them.
2 Cubic basics
2.1 Cubic arrangement
How can depth be achieved with a monitor? Unlike traditional 2D monitors that use pixels created by light-emitting diodes (LEDs), a 3D monitor would use a series of cubes. There will need to be a large quantity of these cubes though. To achieve high definition, 1920 of these cubes stacked 1080 tall and 1920 deep would form the array. This is a mind-boggling 3,981,312,000 cubes. The cubes, however, will be very small, approximately 250 micrometers (about 1/100 of an inch), about the same size as that of pixels on a typical monitor available at retail stores. This would make the display area 480 millimeters wide, 270 millimeters tall, and 480 millimeters deep (18.9x10.6x18.9 inches, the equivalent of a 21.7-inch monitor). The cubes would be joined together without any gaps between them and would need to have a red, green, and blue light emission source. I've heard of yellow being used, but I have no idea what effect this has on picture quality.
As this image shows, for a 5x3x3 arrangement, the cubes are very closely packed and each has its own red, green, and blue LEDs. Each color is spaced as far apart as possible, and the LEDs are excessively big in this case. It only serves as an example though.
2.2 Cubic lighting
Getting the cubes to light up in a way that creates the scene is one of the unknowns. I have thought of 3 methods to create true 3D scenes. One method involves having all cubes fully lit at all times and using an electrically conductive pure black plate that varies in opacity to adjust color values. The second method, a better one, is to have the cubes themselves light up at varying intensities. The third method, the best, involves having an array of electrically conductive plates, much like having hundreds of ultra-thin TV screens in one, each varying in transparency.
To color a cube so that it outputs orange, the red light needs to be at maximum brightness and the green light needs to be at half brightness (per the RGB triplet (255, 128, 0), in RGB order). We'll use this as an example to explain the methods.
2.2.1 Electrically conductive plate
Between each layer of cubes going progressively toward the back of the display is an electrically conductive plate. This plate is as dark of a black as possible when no electric current passes through it. However, should an electric current pass through it, the plate becomes transparent. The greater the electric current, the more transparent the plate becomes. When the plate is transparent, it allows light to get through. When it is opaque, it does not allow light to get through. When partially transparent, some of the light gets through. To reproduce the orange, the area around the red is fully transparent and the area around blue is fully opaque. The area around green is half transparent.
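To make the plate method concrete, here's a minimal Python sketch of the per-channel transparency rule, assuming (as a simplification) that the plate's transparency responds linearly to the desired brightness:

```python
def plate_transparency(target_rgb):
    """For the always-lit-LED method: the plate region around each color's
    LED passes a fraction of light equal to the desired channel brightness.
    Assumes a perfectly linear plate response, which real materials
    wouldn't have."""
    return tuple(channel / 255 for channel in target_rgb)

# The orange example (255, 128, 0): the red area is fully transparent,
# the green area about half transparent, and the blue area fully opaque.
print(plate_transparency((255, 128, 0)))
```

So reproducing a color is just a matter of how much current each plate region receives, with the LEDs themselves never changing.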
This method makes it easy to ensure that each cube gets the correct color. It does not work, however, when it comes to transparent objects. It uses a lot of energy, since it requires that every cube's LEDs remain on at all times while the monitor is turned on. Also, the wires in the plate can interfere with the scene, creating odd lines that obscure the parts that lie behind them.
2.2.2 Light emitting cubes
The light emitting cubes (LEC) method is a pure array of cubes that emit light of varying intensities. It is a better method than the first, though it's more difficult for drawing. Each cube is joined together without any gap between them. There are no objects between them either, only the LEDs. When a cube needs to be colored orange, the red LED on that cube is at maximum intensity. The blue LED doesn't emit any light and the green LED only emits at half its maximum intensity. To control which cube gets which color, a short-range radio transmitter needs to be in the monitor, likely centered at the bottom. This radio transmitter has a range of about 2 meters (1 for smaller monitors).
This method requires considerably less energy and allows for transparent objects. The main problem with it, however, is that it's difficult to control which cube gets which color without affecting the other cubes, due to the radio transmitter.
2.2.3 An array of plates
The third method is the best of them all. It's kind of like the first, except that no cubes are used at all. Rather, it's like having several hundred ultra-thin TV screens in one, each of which varies in opacity. The more opaque the plate is at a given volumetric pixel (or voxel), the more solid that voxel appears. For parts of the scene where nothing is, that voxel is completely transparent. To otherwise reproduce the colors more accurately, the color values need to be multiplied by the opacity. Consider a thick piece of clean glass for example. It might have an ARGB color of (32, 64, 207, 223). To create this, that object is drawn with the plate roughly 1/8 opaque, but using the RGB color of (8, 26, 28) instead. It may seem extremely dark, but think of something like the ARGB color of (0, 255, 255, 255). You can't have an all-white light here for this voxel. With no light being emitted, you can see right through that portion of the scene onward to objects further beyond without affecting them. What about the ARGB color of (255, 0, 0, 0)? That's where the plate itself is fully opaque, but it doesn't emit any light, creating a black obstruction. Since this method allows for transparent objects, antialiasing is also possible.
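Here's a small Python sketch of that opacity-times-color rule, just to verify the glass example above (this is only the arithmetic, not real driver code):

```python
def voxel_drive(argb):
    """Convert an ARGB voxel color to (plate opacity, emitted RGB).
    The emitted color is scaled by the opacity, so a mostly transparent
    voxel also emits proportionally less light."""
    a, r, g, b = argb
    opacity = a / 255
    emitted = tuple(round(c * opacity) for c in (r, g, b))
    return opacity, emitted

# The glass example from the text: (32, 64, 207, 223)
print(voxel_drive((32, 64, 207, 223)))  # roughly 1/8 opaque, emits (8, 26, 28)
# Fully transparent white emits nothing; fully opaque black blocks but emits nothing.
print(voxel_drive((0, 255, 255, 255)))
print(voxel_drive((255, 0, 0, 0)))
```

The (0, 255, 255, 255) and (255, 0, 0, 0) cases fall out of the same rule: the first emits no light at all, the second sets the plate fully opaque with no emission.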
This method has only the wire visibility issue to resolve that's present in the first case. All of the advantages of both methods are available through this method.
2.3 Drawing the scene
This section assumes the use of the array of plates method.
2.3.1 The coordinate system
Understanding the 3D coordinate system is essential to understanding how the scene gets drawn. Placing the origin at the frontmost bottom left corner makes it easy to understand. The X position increases as one goes to the right. The position (1919, 0, 0) would be the cube in the frontmost bottom right corner. The Y position increases as one goes higher. The position (0, 1079, 0) would be the cube in the frontmost top left corner. Lastly, the Z position increases as one goes deeper into the scene (like a video card's Z buffer, used to control the drawing order in 3D games). The position (0, 0, 1919) would be the cube in the backmost bottom left corner.
2.3.2 Timing and synchronization
Each cube needs to have a timing system of some sort to be in sync with the video frames. When the monitor is turned on, a series of synchronization tests must be performed. This is to line up the frequency of the video card's output to the cubes updating their colors. Every frame, there should be a time in which the monitor does a brief realignment so that, as soon as another frame comes in from the video card, the first cube gets updated immediately.
The order of the cubes is left to right, bottom to top, then front to back. This is based on an increasing array index position (in programming). Since this section assumes the array-of-plates method with ARGB color, each voxel uses 4 bytes: the voxel in the frontmost bottom left corner uses the first 4 bytes of data, and the voxel in the backmost top right corner uses the last 4 bytes of data.
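That ordering can be written as a tiny Python sketch, assuming 32-bit ARGB color (4 bytes per voxel), which the array-of-plates method implies:

```python
WIDTH, HEIGHT, DEPTH = 1920, 1080, 1920
BYTES_PER_VOXEL = 4  # ARGB

def voxel_offset(x, y, z):
    """Byte offset of voxel (x, y, z) within one frame, with x varying
    fastest (left to right), then y (bottom to top), then z (front to back)."""
    return ((z * HEIGHT + y) * WIDTH + x) * BYTES_PER_VOXEL

print(voxel_offset(0, 0, 0))           # frontmost bottom left: offset 0
print(voxel_offset(1919, 1079, 1919))  # backmost top right: the very last voxel
```

Any other ordering would work just as well, as long as the video card and the monitor agree on it.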
Getting each specific voxel set to the correct color is done much the same way as standard LCD or LED screens. However, the 1920x1080 screen is drawn 1920 times, to provide for the depth effect.
Let's consider a simple object to draw: a pyramid with a square base and 45° slopes, extending from the very bottom to the very top of the scene and centered within it. This makes the pyramid 1080 voxels on a side. The first 420 voxels on the front and left sides, as well as the last 420 voxels on the back and right sides, are fully transparent. The closest part of the pyramid, in the bottom left corner, starts out 1/4 opaque. For a white pyramid, this would use the ARGB color of (64, 255, 255, 255), which makes the lighting the RGB color (64, 64, 64), a dark gray. The next 1078 voxels to the right are 1/2 opaque, for (128, 255, 255, 255) becoming (128, 128, 128). The rightmost voxel is the same as the first one: 1/4 opaque. The first voxel behind the first voxel in this scene is the same as the second, half opaque. The voxel to the right of that, however, is fully opaque, for (255, 255, 255, 255) giving the RGB value of (255, 255, 255). The voxel directly above the first 1081 is fully transparent, for (0, 255, 255, 255) becoming (0, 0, 0). The next one is the same as the first. This, of course, assumes that the pyramid is equally lit on all sides. With lighting, white changes to darker shades of gray. To optimize drawing, the interior voxels of solid objects can be made completely transparent, using (0, 0, 0, 0) as the ARGB color.
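That interior-voxel optimization can be sketched in Python. The six-neighbor test below is my own simplistic criterion for "interior"; a real driver would need something smarter:

```python
def hollow_out(solid):
    """Given a 3D grid of booleans (True = solid voxel), return the set of
    voxel coordinates that can be made fully transparent (0, 0, 0, 0)
    because all six face neighbors are also solid, so they can never
    be seen."""
    nx, ny, nz = len(solid), len(solid[0]), len(solid[0][0])
    interior = set()
    for x in range(1, nx - 1):
        for y in range(1, ny - 1):
            for z in range(1, nz - 1):
                if solid[x][y][z] and all(
                    solid[x + dx][y + dy][z + dz]
                    for dx, dy, dz in ((1, 0, 0), (-1, 0, 0), (0, 1, 0),
                                       (0, -1, 0), (0, 0, 1), (0, 0, -1))
                ):
                    interior.add((x, y, z))
    return interior

# A solid 3x3x3 cube: only the center voxel is interior.
cube = [[[True] * 3 for _ in range(3)] for _ in range(3)]
print(hollow_out(cube))  # {(1, 1, 1)}
```

For a large solid object, nearly all of its voxels are interior, which is also why the compression figures later in this article are so favorable.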
3 The casing
To protect the thin plates, there must be a shield of sorts that fully encases them. This shield should be made of glass or plexiglass and placed right up against the display. Except for the front, all of the sides must be as dark of a black as possible, for optimum viewing results.
Various buttons would be included to turn the power on and off, and to adjust the volume, channel, or settings (such as brightness or color adjustment). One unique setting would be the scale of depth. A greater scale of depth makes further objects appear much further away, giving a greater sense of depth for foreground objects. A smaller scale of depth brings further objects closer, giving distant objects a better sense of depth.
4 Problems to solve
There are a few things that I don't know how to solve.
4.1 Workmanship uncertain
The first is whether or not this will even work in the first place. I have no way to build such a thing and I don't have mechanical or electrical knowledge on building such a device, let alone programming drivers. I haven't a clue as to how to get a simple light bulb to turn on by changing the value of a variable.
4.2 Extreme computer processing speed
The second and currently the biggest problem, provided the idea even works in the first place, is that today's computers cannot handle a data rate of even 477,279,682,560 bytes per second (for 32-bit color at the NTSC standard of 29.97 fps; computer monitors can easily reach 75 fps, a bit more than 2 1/2 times that, or 1,194,393,600,000 bytes per second). Thus, early models will have to be much lower in resolution. Early prototypes might go with 160x120x160 instead (a data rate of 368,271,360 bytes per second for TV, or 921,600,000 for computer monitors), just for proof of concept. For comparison, 1080p HDTVs have a data rate of 372,874,752 bytes per second, and the largest monitors I'm aware of, at 2560x1600, have a data rate of 1,228,800,000 bytes per second (for a 75 Hz refresh rate). Given the growth rate of computer processing power, doubling every 18 months, it'll be around the year 2025 before computers can handle the 1920x1080x1920 size mentioned in this article.
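To double-check those figures, here's a quick Python sketch of the arithmetic (the real hardware obviously wouldn't run anything like this):

```python
def data_rate(width, height, depth, fps, bytes_per_voxel=4):
    """Uncompressed bytes per second for a volumetric display,
    assuming 32-bit ARGB color by default."""
    return round(width * height * depth * bytes_per_voxel * fps)

print(data_rate(1920, 1080, 1920, 29.97))  # 477,279,682,560
print(data_rate(1920, 1080, 1920, 75))     # 1,194,393,600,000
print(data_rate(160, 120, 160, 29.97))     # 368,271,360
print(data_rate(160, 120, 160, 75))        # 921,600,000
```

The full-resolution figure at 29.97 fps is roughly 1,280 times the data rate of a 2560x1600 monitor at 75 Hz, which puts the scale of the problem in perspective.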
4.3 Recording film
This is one mystery I haven't a clue on. When it comes to live action, like a hockey game or a newscast, how can the entire scene be converted into a 3D bitmap? The only thing I can come up with is a special camera with 2 separate lenses and an extremely powerful processor inside that takes the data captured by these 2 lenses and uses parallax to determine the depths and positions of the various objects. The less an object's position differs between the 2 lenses, the further the object is from the camera. Things that otherwise don't move, like the sky and clouds, are assumed to be at the highest Z position possible (the 1919 in the example).
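The parallax idea can be sketched with the standard stereo triangulation formula, depth = focal length × lens spacing ÷ disparity. The camera numbers below are made up purely for illustration:

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Classic stereo triangulation: depth = f * B / d.
    A smaller disparity (less shift of an object between the two lenses)
    means a more distant object, as described above."""
    if disparity_px == 0:
        return float("inf")  # no measurable shift: treat as infinitely far
    return focal_length_px * baseline_m / disparity_px

# Hypothetical camera: 1000-pixel focal length, lenses 10 cm apart.
print(depth_from_disparity(1000, 0.1, 50))  # an object about 2 meters away
print(depth_from_disparity(1000, 0.1, 0))   # inf: e.g. the sky and clouds
```

The zero-disparity case maps naturally onto the "highest Z position possible" rule for the sky and clouds.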
For computer-generated films, however, it's easy enough to get the resulting film: once the producer gives the final go-ahead to send it to disk for mass production, the computer renders each frame directly as a complete 3D bitmap. 3D games benefit in the same way, except that the scene must be computed in real time, an extremely CPU- and/or GPU-demanding task. 1920x1080x1920 games probably won't be seen until 2030.
Consumer-grade volideo camcorders might start around 2030, with volumetric photos ("volutos" if you will) likely coming around 2020.
4.4 Storage file size
With a data rate of 382,205,952,000 bytes per second for film (based on 24 frames per second), today's 3 terabyte (3-trillion-byte) hard drives can only store 7.85 seconds of uncompressed 1920x1080x1920 volumetric video (or "volideo" if you will). Fortunately, since a huge chunk of each frame is fully transparent, the very simple RLE compression algorithm could easily achieve 96% compression for almost any scene. This 25:1 compression ratio means that that same 3 TB hard drive can store 196.22 seconds of volideo, about 3 1/4 minutes. That's still much too short to store a feature-length film of 120 minutes.
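As a rough illustration of why RLE works so well here, this is a minimal run-length encoder in Python; the (count, value) byte-pair format is my own toy scheme, not any real volideo codec:

```python
def rle_encode(data):
    """Minimal run-length encoder emitting (count, value) byte pairs,
    with run counts capped at 255. Long runs of identical bytes, such as
    fully transparent voxels, collapse to almost nothing."""
    out = bytearray()
    i = 0
    while i < len(data):
        run = 1
        while i + run < len(data) and data[i + run] == data[i] and run < 255:
            run += 1
        out += bytes((run, data[i]))
        i += run
    return bytes(out)

# A mostly transparent frame slice: 1000 zero bytes plus a few colored ones.
frame = bytes(1000) + bytes([255, 128, 0, 64])
encoded = rle_encode(frame)
print(len(frame), "->", len(encoded))  # 1004 -> 16
```

A scene that is mostly empty space compresses enormously, while the worst case (no two adjacent bytes equal) actually doubles in size, which is why RLE alone isn't enough for feature-length volideo.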
This is where lossy compression algorithms come in, much like MPEG-2 for video and JPEG for photos. Many of the same techniques that let MPEG-2 compress video well could also work with volideo, though the algorithms would have to be changed so that transparent parts are also handled. It wouldn't be unusual to achieve a 500:1 or even 3000:1 compression ratio at decent quality this way, allowing feature-length volideos of the 1920x1080x1920 size to easily fit on a 3 TB hard drive. By 2025, however, 3 PB (petabyte, 3000-trillion-byte) hard drives might be available, making the storage of a few thousand such volideos possible. Today's 25 GB Blu-ray disks will likely be 25 TB optical disks by then (or even data cubes), enough that a 120-minute 110:1-compressed volideo can easily fit on one.
Although I thought of this as a way to suit the environment in scene 16 of my book, I realized that it is something entirely possible within the first part of the 21st century, aside from the tremendous computer processing power needed. Actually seeing the depth of a scene would produce amazing visual results, with an impact perhaps similar to radio broadcasts giving way to television when it first came online. This would likely be the next step after today's 3D television systems. Whether or not something like this will actually get developed, that I don't know.