Got a Big Plan

Tuesday, August 28, 2012

Cheese live from GStreamer Conference 2012

Greetings from San Diego!

Had a good time at the GStreamer Conference, seeing folks from Collabora, chatting, having a few beers, watching talks and some hacking. Good days :)

That's not what I want to talk about. I just want to announce this picture:

It is (likely) the first picture taken with Cheese using GStreamer 1.0 :)

Here's a quick wrap-up of the changes:

* Camerabin2 is now mostly ported, with just an annoying bug left in video recording.
* I believe most video effects were already ported some months ago (and they work, see the picture ^).
* Porting Cheese to 1.0 was very easy. Seriously, the application API hasn't changed much.
* There will be bugs, and there are still some critical asserts being printed, but the hardest part is over now.

The ported version is in a branch at http://cgit.collabora.com/git/user/thiagoss/cheese.git/. I'll get it reviewed by the Cheese developers before putting it into the official repository.

That's it. Time for the last talk of the conference, and then some fun. o/

Tuesday, January 18, 2011

A renegotiate event for GStreamer

Currently GStreamer has a limitation: the only way to make an element renegotiate is through pad buffer allocs. Check the function documentation to understand how it works.

By using pad buffer allocs, an element can ask downstream whether it wants new caps, but it can't tell upstream to pick new caps. Being able to do that would help dynamic pipelines and applications that hot-swap elements, which might happen in camerabin(2).

So I made a first attempt at a new upstream event that makes the pipeline (or a part of it) redo caps negotiation and try to pick optimal caps.

I had 2 basic use cases in mind:

videotestsrc ! capsfilter name=cf ! fakesink
videotestsrc ! capsfilter caps="" ! ffmpegcolorspace ! videoscale ! capsfilter name=cf ! fakesink

In both cases, the capsfilter named 'cf' would change its caps property periodically, making the pipeline renegotiate to pick new caps that are both compatible and optimal.
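To make the idea concrete, here is a toy model in plain Python. It is not GStreamer code and none of the names are GStreamer API: the caps strings, `SOURCE_CAPS` and `negotiate()` are all made up for illustration. It only shows the behaviour I'm after: every time 'cf' changes what it accepts, the source side picks again, choosing its most preferred caps among the ones still allowed.

```python
# Toy model of caps renegotiation; none of these names are GStreamer API.

def negotiate(source_caps, filter_caps):
    """Return the first caps the source prefers that the filter also accepts.

    source_caps is ordered from most to least preferred, so the first
    match stands in for the "optimal" choice. Returns None if nothing fits.
    """
    for caps in source_caps:
        if caps in filter_caps:
            return caps
    return None


# What the (hypothetical) source can produce, best first.
SOURCE_CAPS = ["I420 640x480", "I420 320x240", "RGB 320x240"]

# The capsfilter named 'cf' changes its caps property over time.
FILTER_SEQUENCE = [
    {"I420 640x480", "I420 320x240"},
    {"RGB 320x240", "I420 320x240"},
    {"RGB 320x240"},
]

# One renegotiation per change of the filter caps.
negotiated = [negotiate(SOURCE_CAPS, allowed) for allowed in FILTER_SEQUENCE]
for caps in negotiated:
    print(caps)
```

In a real pipeline the choice has to survive every element between the source and 'cf' (ffmpegcolorspace and videoscale in the second use case), which this sketch ignores completely.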

The resulting patches were really simple and I only modified basetransform and basesrc (other than adding the new event to core). Keep in mind that I'm still experimenting and we should search for regressions that this might cause. Next I'd like to go for a scenario with elements with multiple src/sink pads (demuxer/tee/selector).

The patches are on a gstreamer branch here and the test cases were added on a branch on my -base clone.

Thursday, December 2, 2010

Camerabin2

As some of you might know, there have been some plans for a new camerabin design on the wiki for some time.

The current camerabin uses a single pad to output data for the viewfinder, video recording and image capture. This has two problems:

* Requires a mix of input/output-selectors and tricky switch handling code to keep buffers on their correct paths.
* Managing different caps on each output type (images/videos/viewfinder) isn't simple.

Those were the main reasons motivating us to rewrite this in a simpler way. So our adventure with camerabin2 began. Long story short, we already have a prototype on gitorious and it has the minimum basic features: image capture, video recording and a viewfinder.

Here's a summary of the important changes; for more details, refer to the wiki.

The key change
The big change is to have a source element (from now on called camera source) that has 3 source pads, one for each task: viewfinder, image capture, video recording.

Why write camerabin2 instead of refactoring camerabin?
The short answer is that this is a major design change. Writing from the ground up is probably safer and faster, and it won't bother people using the current camerabin. Also, as it requires a new source element, refactoring would cause major incompatibility with current sources.

Modules
We are aiming for a more modular approach this time, so we have a viewfinderbin, a videorecordingbin and an imagecapturebin. Those are public elements that can be used outside of camerabin2.

The new 3 pad source
Thanks to Rob Clark's work from some time ago (he refactored camerabin into the new design as a proof of concept), we already have a working 3 pad source for testing our prototype. Truth be told, I haven't dug very deep into how the source works internally, but our goal is to provide a basecamerasrc that will make it easy (or at least, easier) to develop source elements with 3 pads.

Those are the main things I'd like to post here. I'm trying to schedule a meeting with developers interested in using camerabin2 (or who use camerabin) to discuss features, problems, requests and any camerabin2-related topic. Sometime in the next few days would be great. Nokia, Empathy and Cheese developers have already shown interest in this. If you are interested too, ping me on IRC (thiagoss at #gstreamer at freenode).

[Edited] Forgot to mention that the camerabin2 branch on gitorious already contains an example application under tests/examples/camerabin2.

Tuesday, May 11, 2010

gst-opencv design choices

While I continued wrapping new OpenCV functions into GstElements yesterday, I faced an interesting design choice about how to map OpenCV functions' parameters to GstElement properties.

Take a look at the cvSmooth docs. You can see that it has a type parameter, followed by param1, param2, param3 and param4, whose semantics change depending on the type used. The question is: how do we expose those in the 'cvsmooth' GstElement?

I could think of 3 different choices here:

1) Go straightforward and use the same API as OpenCV

As a result, we should have an element with the properties named after the OpenCV parameters:

"cvsmooth type=blur param1=5 param2=3 param3=0.0 param4=0.0"

This results in a rather unintuitive API, but we keep it aligned with OpenCV's, making it easy for people who already know one API to use the other. The element docs would mostly point to OpenCV's docs. The resulting code is simple and easy to maintain.

2) Have multiple elements: cvsmoothblur, cvsmoothgaussian, cvsmooth...

We could put each smooth algorithm (type) into a separate element and have its properties reflect the semantics of that type. For example, we would have cvsmoothblur, cvsmoothmedian and so on, one for each type. The properties of each one would be named according to their semantics, instead of some paramX.

This provides a nice API but might multiply the number of elements for every function that has this type parameter or a similar one. I don't know how common this is. It might be a good solution if there are only a few of those. A downside is that switching the type requires hot-swapping elements, but I don't think this is a common use case.

3) Expose properties for each semantics and use them only if their type is selected.

We still keep it to one element, but we add one property for each semantic a parameter can assume. Each of those would only be used when its corresponding type is selected.

For example: param3 might be the "gaussian standard deviation" or the "color sigma" if type is gaussian or bilateral respectively. We add those 2 properties (standard-deviation and color-sigma) that are only going to be used if their types are selected.

This makes lines like these possible:

"cvsmooth type=gaussian standard-deviation=5.0" or

"cvsmooth type=bilateral color-sigma=1.0"

The code is a little messier than in the options above.
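Here is a rough sketch of what that extra bookkeeping could look like, written in plain Python rather than as a real GstElement. The `PROPERTY_SLOTS` table and `resolve_params()` are hypothetical names, the defaults are placeholders, and the slot assignments simply follow the param3 example above.

```python
# Sketch of option 3: named properties mapped onto cvSmooth's generic params.
# The table, the function name and the defaults are illustrative assumptions.

# For each smooth type, which named property lands in which paramN slot.
PROPERTY_SLOTS = {
    "gaussian": {"standard-deviation": "param3"},
    "bilateral": {"color-sigma": "param3"},
}


def resolve_params(smooth_type, properties):
    """Translate named element properties into param1..param4 values.

    Properties that belong to a different type are ignored, which is
    the behaviour option 3 describes.
    """
    params = {"param1": 3, "param2": 0, "param3": 0.0, "param4": 0.0}
    slots = PROPERTY_SLOTS.get(smooth_type, {})
    for name, value in properties.items():
        if name in slots:
            params[slots[name]] = value
    return params


# Both properties are set, but only the one matching the type is used.
settings = {"standard-deviation": 5.0, "color-sigma": 1.0}
gaussian = resolve_params("gaussian", settings)
bilateral = resolve_params("bilateral", settings)
print(gaussian["param3"], bilateral["param3"])
```

The lookup table is the "messier" part: it has to be kept in sync with OpenCV's docs for every type, which options 1 and 2 avoid.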

Given those options, I really don't like option 3. I'm considering 1 or 2. From a quick look at some pages of OpenCV's transformations API I could see that this is not very common, and when it happens, only one parameter has a 'variable semantic'. Looks like I picked the trickiest one as my example.

So, which option would you choose?

Thursday, May 6, 2010

Hacking in gst-opencv

It has been years since I last used OpenCV. We (me and some friends working in a lab at the university) used it to process images in batches or to process frames live from a webcam. Things could have been much easier if I had known GStreamer back then. That said, I decided to take a look at gst-opencv to see what we can already do with it.

There are a few features wrapped as elements at the moment and they work quite well, but it could have a much larger feature set, and it seems no one has been working on it recently. Given that, and having a little spare time these days, I decided to start hacking on gst-opencv and try to bring it in alongside the other GStreamer modules. I'd prefer to have a separate gst-opencv module, but adding it as a new plugin in gst-plugins-bad is also an option. What do you think?

Current features

[Edited: It seems the videos can only be seen directly on the post at blogspot]

Some nice stuff can already be done with the current elements. Let me show some.

I recorded this video outside some minutes ago:

We can use edgedetect on it and see its edges:

Command: gst-launch uridecodebin uri=youruri ! queue ! ffmpegcolorspace ! edgedetect ! ffmpegcolorspace ! theoraenc ! oggmux ! filesink location=result.ogg

Or we can segment it with pyramidsegment and get a nice effect (some people would enjoy this in PiTiVi?), or use it for machine vision stuff:

Command: gst-launch uridecodebin uri=youruri ! queue ! ffmpegcolorspace ! pyramidsegment ! ffmpegcolorspace ! theoraenc ! oggmux ! filesink location=result.ogg
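Both commands have the same shape and only the filter element in the middle changes. As a small sketch, this hypothetical Python helper (`filter_command()` is my own name, and the URI below is just a placeholder) builds the argument list for either one. It only assembles the arguments; it doesn't run anything.

```python
import shlex


def filter_command(filter_element, uri, output="result.ogg"):
    """Build gst-launch arguments for decode -> filter -> Theora/Ogg file."""
    elements = [
        ["uridecodebin", f"uri={uri}"],
        ["queue"],
        ["ffmpegcolorspace"],
        [filter_element],
        ["ffmpegcolorspace"],
        ["theoraenc"],
        ["oggmux"],
        ["filesink", f"location={output}"],
    ]
    argv = ["gst-launch"]
    for index, element in enumerate(elements):
        if index > 0:
            argv.append("!")  # link this element to the previous one
        argv.extend(element)
    return argv


# Placeholder URI; replace it with your own file.
edge_argv = filter_command("edgedetect", "file:///tmp/input.ogv")
segment_argv = filter_command("pyramidsegment", "file:///tmp/input.ogv")
print(shlex.join(edge_argv))
```

The list can be handed to `subprocess.run()` as-is if you want to actually launch the pipeline.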

OpenCV already ships some face detection profiles for you (on Ubuntu, they go into /usr/share/opencv/haarcascades/), so you can use them with the facedetect element, or train your own classifiers to use with it. I stuck with the defaults and tried it on some pictures. Here are 2 of them:

I think it works pretty well :)

You can disable the circles and just get messages with the faces' positions and do whatever you want with them.

Other than those, there are also the 'textwrite', 'templatematch' and 'faceblur' elements.

Current work

I've been working on a simple base class that will make it easier to map simple 1-to-1 OpenCV functions into elements, providing some common properties (like ROI and COI) and GstBuffer-IplImage conversion. This will help cover more functions and should be enough to get me reacquainted with the API; after that I can go for the fancier stuff.

For example, take the cvSmooth function: we should only have to write code to map its parameters into properties, plus a simplified chain function that already works on IplImages instead of GstBuffers.
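The split of responsibilities can be sketched in plain Python. This is not the actual base class: every class and method name here is hypothetical, flat lists stand in for GstBuffers, lists of rows stand in for IplImages, and a trivial threshold stands in for an OpenCV function. The point is only that the base class does the conversion and the subclass just maps a property and works on the image.

```python
# Sketch of the base-class idea; all class and method names are hypothetical
# and plain Python lists stand in for GstBuffer and IplImage.

class OpenCVFilterBase:
    """Handles the buffer <-> image conversion so subclasses don't have to."""

    def __init__(self, width, height):
        self.width = width
        self.height = height

    def chain(self, buffer):
        # Convert the flat "buffer" into a 2D "image", one list per row.
        image = [
            buffer[row * self.width:(row + 1) * self.width]
            for row in range(self.height)
        ]
        result = self.transform_image(image)
        # Flatten the result back into a "buffer" for downstream.
        return [pixel for row in result for pixel in row]

    def transform_image(self, image):
        raise NotImplementedError


class Threshold(OpenCVFilterBase):
    """A stand-in 1-to-1 wrapper: one property, one image function."""

    def __init__(self, width, height, level=128):
        super().__init__(width, height)
        self.level = level  # would be exposed as an element property

    def transform_image(self, image):
        return [[255 if pixel >= self.level else 0 for pixel in row] for row in image]


element = Threshold(width=2, height=2, level=100)
output = element.chain([10, 200, 100, 99])
print(output)  # [0, 255, 255, 0]
```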

Repositories

gst-opencv's main repository is at github, and I have my personal branches here. From time to time I ping Elleo to update the github one, but I hope we can get this upstream in the next few weeks.

Monday, November 9, 2009

Trying GStreamer on Windows

Since I started working with GStreamer I had never tried it on Windows, so tonight I decided to give it a go. Edward pointed me to the winbuilds, and it took no more than pressing 'next' 4 or 5 times to have the default applications (gst-launch, gst-inspect...) and lots of plugins. Easy enough.

That must be the reason it has been some time since I've heard complaints about installing/using GStreamer on Windows. I wonder if there are any other builds out there like these?

Thanks ylatuya!

Wednesday, August 12, 2009

GstCollectPads2 branch

This week I decided to grab the GstCollectPads2 patch (from bug #415754) and start a branch in my freedesktop repository for porting muxers to it. So far we've got:

oggmux
avimux
matroskamux (patch by Mark)
asfmux
qtmux

If you always wanted GStreamer to be able to mux subtitles into your movies, now is the time to provide specs/samples/patches for it. Installing from this branch (I try to keep it up to date with current git master) and testing it a lot in your favorite applications would also help find regressions from the porting process.

Suggestions to improve GstCollectPads2 or other use cases are also welcome.