World Cup coverage - how did they do this?

Just watched the BBC coverage of the Spain vs Netherlands game. In the post-match analysis they used a graphic over the studio shot that seemed to track the camera's zoom and panning, so that it looked like the image was floating in mid-air.

All I can think is that they either pre-program the camera coordinates and the graphic so they match, or the graphics system reads the coordinates live from the camera.

Does anyone know how they actually do it? I'm just interested to know :-)

Here is a short clip I just captured

Re: World Cup coverage - how did they do this?

didikunz wrote:They do live tracking. They use systems that cost six-figure sums, from ORAD or others. Not our world... :)
Ouch, that's a lot of money for such a small feature (I can see the benefit for the NFL, but it's just a fancy extra in a studio).

I was just interested in how they did it - as powerful as Caspar is, I didn't want to try and get Caspar to do the same thing! Although thinking about it, it's not impossible!

If Flash could skew an image on the fly, then it's only a case of sending the coordinates in via the XML? :-)
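Just to sketch the idea (the channel/layer, template name and field names below are made-up placeholders, and the Flash template itself would have to do the actual skewing from the values it receives):

import socket

HOST, PORT = "127.0.0.1", 5250  # default AMCP port

def template_data(fields):
    # Build the componentData XML that a Flash template reads
    items = "".join(
        '<componentData id="{0}"><data id="text" value="{1}"/></componentData>'.format(k, v)
        for k, v in fields.items()
    )
    return "<templateData>{0}</templateData>".format(items)

def send(cmd):
    # One-shot AMCP command over TCP (fine for a sketch, not for 50 updates a second)
    with socket.create_connection((HOST, PORT)) as s:
        s.sendall((cmd + "\r\n").encode("utf-8"))
        return s.recv(4096).decode("utf-8", "replace")

xml = template_data({"posX": "412", "posY": "186", "skew": "12.5"}).replace('"', '\\"')
# Load the (hypothetical) template once...
print(send('CG 1-20 ADD 0 "my_tracking_template" 1 "{0}"'.format(xml)))
# ...then push new coordinates every time the camera data changes:
print(send('CG 1-20 UPDATE 0 "{0}"'.format(xml)))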

Re: World Cup coverage - how did they do this?

def87 wrote:As soon as CasparCG gets a 3D layer option, we would only need a "Camera Producer" taking the data from an encoded PTZ head and transforming the layer in 3D space.

That would actually be really easy and cheap.
If you read the description of Piero, you see that it works without an encoded camera head, just by tracking the lines in the picture. That will not be as easy, nor as cheap...

And by the way: does a character generator need this functionality? Caspar is developing into a more and more universal system; I think we should stop before it can make rain or snow... :)
Didi Kunz
CasparCG Client-Programmer, Template Maker & Live CG-Operator
Media Support, CH-5722 Gränichen, Switzerland
Problems? Guide to posting Bug reports & Feature requests

Re: World Cup coverage - how did they do this?

Hi all

The BBC are using multiple Augmented Reality solutions in Brazil, and I think there is some confusion here.

The augmented reality that allows graphics to be keyed over the studio - showing players on glassy plinths, allowing players to walk in, showing virtual video screens etc. - is a technique the BBC use in their football studio in Salford, and in their presentation studio at Wimbledon. I believe it uses something like a StypeGrip Stype Kit motion capture head that fits to a standard Jimmy Jib, outputting camera position data that can then be rendered by a graphics box and keyed into the jib output.

This is a physical motion capture head fitted to the jib - there is no analysis of the camera's output video (except possibly during configuration). The graphics are, I expect, rendered on a box made by a certain Norwegian graphics company. Piero is not used for this type of motion tracking by the BBC, AFAIK. You'll notice they only have the AR on the jib - that's because it's the only camera in the studio with the motion capture head.

The augmented reality that allows EVS clips of football action to have player tracking, ball tracking etc. is done using a system called Piero, originally developed in-house by BBC R&D and further developed by Red Bee (amongst others - the technology is licensable), which uses image analysis techniques to track moving elements in a video signal.

(BBC News used to use - and I believe ITN still use - the BBC-developed Radamec FreeD system for camera tracking in VR studios, which used a matrix of bar-coded discs in the ceiling and CCTV cameras mounted on top of each studio camera. For their recent local election programme from BBC Elstree, though, they used the MoSys Startracker system. :) There is a bit of a garbage matte issue in the clip I posted, but the great thing was the integration between the real studio and the VR studio - allowing the Moviebird crane to move from the studio into the VR space on-shot with rock-solid motion tracking, including reverse shots that let you look at the real studio from within the VR studio, through VR graphic elements with transparency.)
Views expressed here are entirely my own, and nothing to do with my employer.

Re: World Cup coverage - how did they do this?

didikunz wrote:And by the way: does a character generator need this functionality? Caspar is developing into a more and more universal system; I think we should stop before it can make rain or snow... :)
Why not make CasparCG a more valuable product?
CCG already uses OpenGL for rendering, so adding 3D transformations to layers should be an easy task and a great feature.

Of course people will want to render 3D objects next... :-)
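In the meantime, some of this could probably be faked with the existing mixer: a corner pin via MIXER PERSPECTIVE (available in newer server versions), fed from whatever pan/tilt/zoom data the head provides. A very rough sketch - the mapping from camera angles to corner points below is a made-up placeholder, not real lens calibration, and the parameter order should be checked against the wiki for your server version:

import socket

def send(cmd, host="127.0.0.1", port=5250):
    # One-shot AMCP command over TCP to the default server port
    with socket.create_connection((host, port)) as s:
        s.sendall((cmd + "\r\n").encode("utf-8"))
        return s.recv(4096).decode("utf-8", "replace")

def corner_pin_from_camera(pan_deg, tilt_deg, zoom):
    # Placeholder mapping: shift the quad opposite to the pan/tilt and shrink
    # it with zoom, just to show the plumbing end-to-end. A real implementation
    # needs proper camera/lens calibration.
    dx = -pan_deg / 100.0
    dy = tilt_deg / 100.0
    s = 0.5 / max(zoom, 0.1)          # half-size of the quad, in screen fractions
    cx, cy = 0.5 + dx, 0.5 + dy       # centre of the quad
    tl = (cx - s, cy - s)
    tr = (cx + s, cy - s)
    br = (cx + s, cy + s)
    bl = (cx - s, cy + s)
    # Assumed order: top-left, top-right, bottom-right, bottom-left (x y pairs)
    coords = " ".join("%.4f %.4f" % p for p in (tl, tr, br, bl))
    return "MIXER 1-10 PERSPECTIVE %s" % coords

print(send(corner_pin_from_camera(pan_deg=4.0, tilt_deg=-2.0, zoom=1.2)))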

Re: World Cup coverage - how did they do this?

We use a 3-axis gimbal on a track; the gimbal is modified with tracking sensors, and the Fuji lens already gives out data.
We pack all the data into a LAN interface and pass it to a Notch computer, which in turn controls a D3 media server.
We use the stype protocol.
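To give an idea of the receiving end, here is a generic sketch of a listener for that kind of tracking data, assumed here to arrive as UDP packets on the LAN. The packet layout (six little-endian floats) is invented purely for illustration - it is not the actual stype format, which you take from the vendor's protocol documentation:

import socket
import struct

UDP_PORT = 6301  # arbitrary example port

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.bind(("0.0.0.0", UDP_PORT))

while True:
    data, addr = sock.recvfrom(1024)
    if len(data) < 24:
        continue  # not a full packet
    # Hypothetical layout: pan, tilt, roll, zoom, focus, height as 6 floats
    pan, tilt, roll, zoom, focus, height = struct.unpack("<6f", data[:24])
    # Hand the values on to whatever is rendering (Notch, a graphics box, etc.)
    print("pan=%.2f tilt=%.2f roll=%.2f zoom=%.2f focus=%.2f height=%.2f"
          % (pan, tilt, roll, zoom, focus, height))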

The gimbals (Vizrt-certified) and LAN interfaces are from a small Swiss company:
RemoteCamSystems GmbH
Industriestrasse 15
CH - 8355 Aadorf


Re: World Cup coverage - how did they do this?

There is a lot that went into this system. We are not coders; however, with the help of some of the coding students at our school, we were able to make it happen.

We ran a combination of ARToolKit 5.3.2 and CasparCG 2.1.0 Beta 1, spread across two computers. I am going to assume basic knowledge of both CasparCG and ARToolKit.

The first computer was a tower we built (specs aren't really important, just something decent), running Windows 10 with a Decklink SDI. This was the AR computer. In ARToolKit, we ran a modified version of the NFTBook application, which is included with ARToolKit. We modified it to run fullscreen on a monitor. The other modification was a pink overlay: in the nftSimple application there is a simple function that lets the user press a key to display the on-screen controls. We modified this function to remove the text, made it fill the entire screen, and made it bright pink. This is used later for chroma keying.

From there, we trained the NFT application to recognize our football stadium, meaning it could track items to the field. Please take your time with this part, as a good training data set is important for getting a good track.

We run that application fullscreen on a monitor. It then contains the following:

NFTBook application
Inside the app is:
1. The video feed from the Decklink on the bottommost level.
2. On top of that, the pink screen, which can be switched on and off for tracking verification.
3. The tracked item (the yellow line).
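To make that draw order a bit more concrete, here is a small stand-alone illustration using PyOpenGL/GLUT. The real modification lives in ARToolKit's C sample code; this only shows the idea of a toggleable, full-screen pink quad sitting between the video and the tracked graphic:

from OpenGL.GL import *
from OpenGL.GLUT import *

show_overlay = True  # toggled with a key, like the re-purposed "controls" key

def draw_fullscreen_quad(r, g, b):
    # Simple 2D projection, then one solid-colour quad covering the viewport
    glMatrixMode(GL_PROJECTION); glLoadIdentity(); glOrtho(0, 1, 0, 1, -1, 1)
    glMatrixMode(GL_MODELVIEW); glLoadIdentity()
    glColor3f(r, g, b)
    glBegin(GL_QUADS)
    glVertex2f(0, 0); glVertex2f(1, 0); glVertex2f(1, 1); glVertex2f(0, 1)
    glEnd()

def display():
    glClear(GL_COLOR_BUFFER_BIT)
    # 1. the live Decklink video frame would be drawn here (bottom)
    if show_overlay:
        # 2. solid bright pink, keyed out again later in CasparCG
        draw_fullscreen_quad(1.0, 0.0, 1.0)
    # 3. the tracked graphic (the yellow line) would be drawn here (top)
    glutSwapBuffers()

def keyboard(key, x, y):
    global show_overlay
    if key == b'p':  # toggle the overlay for tracking verification
        show_overlay = not show_overlay
        glutPostRedisplay()

glutInit()
glutInitDisplayMode(GLUT_DOUBLE | GLUT_RGB)
glutCreateWindow(b"overlay demo")
glutFullScreen()
glutDisplayFunc(display)
glutKeyboardFunc(keyboard)
glutMainLoop()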

That application's output leaves via HDMI and is then split: one feed goes to the monitor, the other goes to a Decklink Mini Recorder in the second computer.


Enter the second computer.

This computer took in two feeds. Into the Mini Recorder went the HDMI feed from the AR computer. The computer also had a Decklink SDI 4K, which took a clean feed from the camera we were tracking.

We did:

PLAY 1-1 DECKLINK 1
This creates the Decklink feed on the bottom layer of Caspar, playing the source feed.

PLAY 1-9 DECKLINK 2
This creates the Decklink feed on layer 9 of Caspar, playing the AR feed.

We then use two route commands, one routing layer 1 to layer 8, and the other routing layer 1 to layer 10. We'll explain why we did this later.
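For reference, the whole layer stack scripted over AMCP looks roughly like this (the route:// producer syntax is how one layer's output is put onto another layer on 2.1; check the exact syntax against your server version):

import socket

def amcp(cmd, host="127.0.0.1", port=5250):
    # Send one AMCP command to the server's default port and return the reply
    with socket.create_connection((host, port)) as s:
        s.sendall((cmd + "\r\n").encode("utf-8"))
        return s.recv(4096).decode("utf-8", "replace")

for cmd in (
    "PLAY 1-1 DECKLINK 1",     # source feed on the bottom layer
    "PLAY 1-9 DECKLINK 2",     # AR feed (video + pink + line) on layer 9
    "PLAY 1-8 route://1-1",    # copy of the source feed below the AR feed
    "PLAY 1-10 route://1-1",   # copy of the source feed above the AR feed
):
    print(amcp(cmd))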

So in summary, we have the second computer playing the following:
Layer 8: Source Feed
Layer 9: AR Feed
Layer 10: Source Feed

We then used the chroma keyer client that is provided here:

That chroma keyer only targets layer 10 in Caspar. The source feed on layer 10 is keyed to select which parts of the field go on top of the line and which parts the line covers. We key green, so the line shows through over the green areas and nowhere else.

We then just used the chroma keyer built into the official client to key the pink out of the AR feed. By keying out the pink, we isolated the line while keeping the proper layering and mixing. The output from that is then taken to the vision mixer.

This system worked well; however, it had limitations. First off, it was never used in a broadcast in our studio: all our feeds are shot on NTSC analog and then converted to HDMI at the switcher, so the resolution going into the tracker was too low for it to get a good track. The other issue is positioning. We were not experienced enough to develop an interface for adjusting the line position, so we had two text files open on an additional monitor on the tracker: one with the preset positions, the other being the file the software read. However, to refresh its position the application had to be restarted, which made it too slow to use for a real-time show.

This was a lot, and we spent two years getting it together. It probably doesn't make total sense just from this writing. I am happy to Skype, or otherwise call to talk about this some more.


EDIT: Here is the diagram:
Last edited by jackreynolds on 30 Jan 2018, 03:29, edited 2 times in total.