08:50 | jnettlet | awesome. newest openembedded builds are incorrectly detecting the cubox as supporting NEON. |
08:50 | jnettlet | * jnettlet shakes fist at HWCAPS |
10:13 | jnettlet | dv_, do you usually build your OE images with float-abi=softfp or hard? |
10:51 | frilled | heh, rebeeh on video :) |
11:39 | jnettlet | hmmm this program is supposed to be fixed upstream..time to track down the author |
11:42 | jnettlet | oh never mind I see the problem. eglibc applied the patch incorrectly |
12:17 | _rmk_ | jnettlet: any ideas what's going on with this horrid dmaHandle vs Os->baseAddress stuff in gc_hal_kernel_os.c ? |
12:18 | _rmk_ | jnettlet: and is your Os->baseAddress ever non-zero? |
12:23 | jnettlet | _rmk_, never seen baseAddress to not be zero |
12:24 | jnettlet | _rmk_, I believe that is used if you use pmem to allocate the graphics memory |
12:27 | _rmk_ | its probably a good thing that its normally zero |
12:27 | jnettlet | yeah in your version of the driver the rest of the code isn't there yet. In the v4 code that is used to figure out the physical address of the PMEM or ION region |
12:28 | _rmk_ | if you have NO_DMA_COHERENT=0 and baseAddress != 0, then it modifies the stored dmaHandle, which means it violates the DMA API |
12:28 | _rmk_ | because its not passing the same dmaHandle it got from the alloc function back into things like the mmap nor free function |
12:33 | jnettlet | _rmk_, where are you seeing that? I thought that BaseAddress was only used for returning physical addresses. |
12:34 | _rmk_ | there's this nice bit of code: |
12:34 | _rmk_ | if ((Os->baseAddress & 0x80000000) != (mdl->dmaHandle & 0x80000000)) { |
12:34 | _rmk_ | mdl->dmaHandle = (mdl->dmaHandle & ~0x80000000) |
12:34 | _rmk_ | | (Os->baseAddress & 0x80000000); |
12:34 | _rmk_ | } |
12:35 | _rmk_ | and we later do this: |
12:35 | _rmk_ | if (dma_mmap_coherent(Os->device->dev, |
12:35 | _rmk_ | mdlMap->vma, |
12:35 | _rmk_ | mdl->addr, |
12:35 | _rmk_ | mdl->dmaHandle, |
12:35 | _rmk_ | mdl->numPages * PAGE_SIZE) < 0) { |
12:35 | _rmk_ | ... |
12:36 | _rmk_ | and of course later on: |
12:36 | _rmk_ | dma_free_coherent(Os->device->dev, |
12:36 | _rmk_ | mdl->numPages * PAGE_SIZE, |
12:36 | _rmk_ | mdl->addr, |
12:36 | _rmk_ | mdl->dmaHandle); |
12:37 | jnettlet | oh yeah I forgot about that. It is where they are trying to keep the memory alignments needed by the GPU. I fixed that in the userspace driver to not hit that code. |
12:38 | _rmk_ | I think I'm just going to kill off that fiddling with mdl->dmaHandle completely |
12:46 | jnettlet | _rmk_, yeah I have that code wrapped in #if 0 |
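The DMA API point above is that the handle returned by the coherent allocator has to be handed back unchanged to the mmap and free calls. A minimal user-space sketch of one way to honour that while still deriving a bit-31-adjusted address: keep the derived value in its own field. The struct, field, and function names here are hypothetical stand-ins, not the galcore driver's, and the 32-bit handle type is an assumption.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for dma_addr_t on a 32-bit platform. */
typedef uint32_t sketch_dma_addr_t;

/* Hypothetical descriptor, loosely modelled on the mdl fields quoted above. */
struct sketch_mdl {
    void *addr;                    /* CPU address from the allocator */
    sketch_dma_addr_t dma_handle;  /* exactly what the allocator returned; never modified */
    uint32_t gpu_address;          /* derived value, kept separate from the handle */
    size_t num_pages;
};

/* Apply the bit-31 adjustment to a copy instead of rewriting dma_handle. */
static void sketch_set_gpu_address(struct sketch_mdl *mdl, uint32_t base_address)
{
    const uint32_t top_bit = 0x80000000u;

    mdl->gpu_address = (mdl->dma_handle & ~top_bit) | (base_address & top_bit);
}
```

With this layout the mmap and free paths would keep using dma_handle, and only code that needs the adjusted value would read gpu_address.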
13:02 | _rmk_ | I'm not all that thrilled by this either: |
13:02 | _rmk_ | if ((physical >= mdl->dmaHandle) |
13:02 | _rmk_ | && (physical < mdl->dmaHandle + mdl->numPages * PAGE_SIZE) |
13:03 | _rmk_ | what if you have 2G of memory at the top of the 32-bit address space... and you have a mapping right at the end of physical memory... |
13:03 | _rmk_ | mdl->dmaHandle + mdl->numPages * PAGE_SIZE would be zero in that case... |
13:04 | _rmk_ | thankfully, this doesn't arise on the cubox though. |
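The wrap-around described above can be reproduced in plain C. This is a sketch assuming a 32-bit handle and 4 KiB pages; the function names and SKETCH_PAGE_SIZE are made up for illustration. The second function shows a common overflow-safe way to write the same range check by comparing an offset against the size instead of computing the end address.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumption: 4 KiB pages */

/* Same shape as the quoted check: base + size wraps to 0 when the
 * region ends exactly at the top of the 32-bit address space. */
static bool in_region_naive(uint32_t physical, uint32_t base, uint32_t num_pages)
{
    uint32_t end = base + num_pages * SKETCH_PAGE_SIZE;

    return physical >= base && physical < end;
}

/* Overflow-safe form: unsigned subtraction gives the offset into the
 * region; an address below base wraps to a huge offset and is rejected. */
static bool in_region_safe(uint32_t physical, uint32_t base, uint32_t num_pages)
{
    uint32_t size = num_pages * SKETCH_PAGE_SIZE;

    return physical - base < size;
}
```

For a one-page region at 0xFFFFF000, the naive check rejects every address inside the region because the computed end is 0, while the offset form accepts them.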
13:12 | jnettlet | yet |
13:13 | jnettlet | so are you going through and ripping out the NO_DMA_COHERENT code? |
13:14 | _rmk_ | I've already done so |
13:54 | dv_ | jnettlet: whatever the default is |
13:55 | dv_ | jnettlet: the default is hardfp with meta-fsl-arm , and softfp otherwise |
13:56 | jnettlet | dv_, it is okay I got it all sorted out. A patch that needs to be added to eglibc. So now gcc 4.8 with eglibc 2.18 is running with your cubox layer |
13:56 | jnettlet | sorting out iwmmxt optimizations now....finally! |
13:57 | dv_ | \o/ |
13:57 | dv_ | try to push this upstream |
13:58 | dv_ | yocto is in a bugfixing phase, and version 1.5 will be out october 18th . they aim for using gcc 4.8 as the default version. |
13:58 | jnettlet | dv_, I filed a bug in yocto's bug tracker for the eglibc patch. |
13:58 | dv_ | nice |
13:58 | dv_ | did you also make changes to the tune file in my layer? |
13:58 | jnettlet | do you want me to send patches to you for any cubox changes? |
13:58 | dv_ | yeah please |
13:58 | jnettlet | I am still sorting that out. I only want to change the tune parameters for certain packages |
13:59 | dv_ | okay. if gcc 4.8 is added to 1.5, I will then modify the switches in the tune file to use -march=marvellpj4 |
13:59 | dv_ | do I need to specify something extra to activate iwmmxt intrinsics? |
14:00 | jnettlet | yeah for iwmmxt2 you can't specify march it is conflicting. You need to do mcpu=iwmmxt2 and mtune=marvellpj4 |
14:00 | jnettlet | just the weird way the iwmmxt support is implemented |
14:01 | dv_ | oh |
14:01 | dv_ | okay |
14:01 | dv_ | with these , I have iwmmxt and iwmmxt2 intrinsics |
14:01 | dv_ | ? |
14:01 | jnettlet | but I did check and the most recent marvell-pj4 tunings patch did get accepted for gcc 4.8 |
14:02 | jnettlet | yep |
14:02 | dv_ | okay |
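A quick way to check whether a given set of flags actually turned on iwMMXt code generation is to look at the compiler's predefined macros. The sketch below assumes GCC's ARM backend defines __IWMMXT__ and __IWMMXT2__ when the corresponding support is enabled (for example with -mcpu=iwmmxt2 -mtune=marvell-pj4, which is how GCC spells the tuning name); the function name is made up for illustration.

```c
/* Report which iwMMXt level the compiler was told to target.
 * Assumption: GCC predefines __IWMMXT__ / __IWMMXT2__ on ARM when the
 * matching -mcpu is in effect; on any other target this returns "none". */
static const char *sketch_iwmmxt_level(void)
{
#if defined(__IWMMXT2__)
    return "iwmmxt2";
#elif defined(__IWMMXT__)
    return "iwmmxt";
#else
    return "none";
#endif
}
```

Printing the result from a small test program built with the layer's tune flags would show whether the intrinsics are available to packages compiled the same way.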
14:02 | dv_ | thats nice. it will enable the intrinsics inside ffmpeg/libav. |
14:02 | jnettlet | dv_, well that is another patch. ffmpeg has removed iwmmxt support because it was unmaintained |
14:02 | dv_ | oh. |
14:02 | jnettlet | I need to create a patch that reverts the removal and fixes things |
14:04 | jnettlet | pixman has proper support for iwmmxt although I think it needs special configure options still. Or an environment variable set. I will look at my optimized XO packages. |
14:05 | jnettlet | I still have not heard a word from Marvell on any of my vmeta questions. |
15:19 | jnettlet | is there a uboot version for the cubox that exports the device-tree? |
15:34 | _rmk_ | hmm, I think there's a bug in the galcore guard thread |
15:34 | _rmk_ | Sep 10 10:31:28 cubox kernel: [330]GPU stop at 0x102806c0 for 300 ms, lastWaitLink = 0x102806b0 |
15:34 | _rmk_ | Sep 10 10:31:32 cubox kernel: [330]GPU stop at 0x102806c0 for 300 ms, lastWaitLink = 0x102806b0 |
15:35 | _rmk_ | so, if there's a WAIT instruction at 0x102806b0 and a LINK at 0x102806b8 back to 0x102806b0... what could happen if the GPU reads the instruction at 0x102806b0, increments the IP, reads 0x102806b8, increments the IP to 0x102806c0, and then branches back... |
15:36 | _rmk_ | given the number of these that I'm seeing, I think that there's a race between the instruction reads, the link happening, and reading the GPUs current IP. |
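To make the suspected race concrete: with a WAIT at lastWaitLink and a LINK 8 bytes later, a sampled instruction pointer one command past the LINK (0x102806c0 in the log) could still mean the GPU is idling in the loop. The sketch below is purely hypothetical and is not the galcore guard thread's logic; it only shows a window test that would treat such a reading as idle. The 8-byte command size is inferred from the addresses quoted above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: WAIT and LINK each occupy 8 bytes, as the logged
 * addresses 0x102806b0 / 0x102806b8 / 0x102806c0 suggest. */
#define SKETCH_CMD_BYTES 8u

/* True if the sampled IP is at the WAIT, at the LINK, or one command
 * past the LINK (fetched but not yet branched back). Unsigned
 * subtraction rejects addresses below last_wait_link. */
static bool sketch_ip_in_wait_link_window(uint32_t ip, uint32_t last_wait_link)
{
    return ip - last_wait_link <= 2u * SKETCH_CMD_BYTES;
}
```

With the values from the log, sketch_ip_in_wait_link_window(0x102806c0, 0x102806b0) holds, so a guard using this window would not report a stop for that reading.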
15:37 | _rmk_ | I guess no one sees this because they have the debugging disabled |
15:51 | jnettlet | I have run with debug enabled and don't remember seeing this. Is it possible this condition now exists because of some of the other races you have fixed? |
15:53 | jnettlet | If you want to dump your changes to your vivante driver somewhere I am sure wumpus and I could help test some of this for you. |
15:55 | wumpus | yes, I'm looking forward to it too |
15:56 | _rmk_ | jnettlet: you need device->powerDebug set true for these messages to appear |
15:57 | jnettlet | _rmk_, oh I haven't set that. We don't use the scaling as for our architectures we just shut the gpu off when we aren't using it. |
15:58 | _rmk_ | yea, but it also gets used for other debug prints too :( |
16:37 | jnettlet | dv_, have ffmpeg patch reverted and fixed up for the most recent git version. Still need to test if it works or not. |