Interpreting visual studio profiler, is this subtraction slow? Can I make all this any faster?

Ask Time:2021-09-22T05:20:58         Author:John Katsantas

I'm using the Visual Studio profiler for the first time and I'm trying to interpret the results. Looking at the percentages on the left, I found this subtraction's time cost a bit strange:

[Profiler screenshot]

Other parts of the code contain more complex expressions, like:

[Profiler screenshot]

Even a simple multiplication seems much faster than the subtraction:

[Profiler screenshot]

Other multiplications take far longer, and I really don't understand why, like this one:

[Profiler screenshot]

So my question is whether anything weird is going on here.

Complex expressions take longer than that subtraction, and some expressions take far longer than other, similar ones. I ran the profiler several times and the distribution of the percentages is always like this. Am I just interpreting this wrong?

Update:

I was asked to give the profile for the whole function, so here it is, even though it's a bit big. I ran the function inside a for loop for 1 minute and got 50k samples. The function contains a double loop. I include the code as text first for convenience, followed by the profiler screenshots. Note that the text version of the code is slightly newer than the one in the screenshots.

 for (int i = 0; i < NUMBER_OF_CONTOUR_POINTS; i++) {

    vec4 contourPointV(contour3DPoints[i], 1);
    float phi = angles[i];

    float xW = pose[0][0] * contourPointV.x + pose[1][0] * contourPointV.y + contourPointV.z * pose[2][0] + pose[3][0];
    float yW = pose[0][1] * contourPointV.x + pose[1][1] * contourPointV.y + contourPointV.z * pose[2][1] + pose[3][1];
    float zW = pose[0][2] * contourPointV.x + pose[1][2] * contourPointV.y + contourPointV.z * pose[2][2] + pose[3][2];

    float x = -G_FU_STRICT * xW / zW;
    float y = -G_FV_STRICT * yW / zW;
    x = (x + 1) * G_WIDTHo2;
    y = (y + 1) * G_HEIGHTo2;
    y = G_HEIGHT - y;



    phi -= extraTheta;
    if (phi < 0)phi += CV_PI2;
    int indexForTable = phi * oneKoverPI;
    //vec2 ray(cos(phi), sin(phi));
    vec2 ray(cos_pre[indexForTable], sin_pre[indexForTable]);
    vec2 ray2(-ray.x, -ray.y);
    float outerStepX = ray.x * step;
    float outerStepY = ray.y * step;
    cv::Point2f outerPoint(x + outerStepX, y + outerStepY);
    cv::Point2f innerPoint(x - outerStepX, y - outerStepY);
    cv::Point2f contourPointCV(x, y);
    cv::Point2f contourPointCVcopy(x, y);

    bool cut = false;
    if (!isInView(outerPoint.x, outerPoint.y) || !isInView(innerPoint.x, innerPoint.y)) {
        cut = true;
    }
    bool outside2 = true; bool outside1 = true;

    if (cut) {
        outside2 = myClipLine(contourPointCV.x, contourPointCV.y, outerPoint.x, outerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
        outside1 = myClipLine(contourPointCVcopy.x, contourPointCVcopy.y, innerPoint.x, innerPoint.y, G_WIDTH - 1, G_HEIGHT - 1);
    }


    myIterator innerRayMine(contourPointCVcopy, innerPoint);
    myIterator outerRayMine(contourPointCV, outerPoint);

    if (!outside1) {
        innerRayMine.end = true;
        innerRayMine.prob = true;
    }
    if (!outside2) {
        outerRayMine.end = true;
        innerRayMine.prob = true;
    }



    vec2 normal = -ray;
    float dfdxTerm = -normal.x;
    float dfdyTerm = normal.y;
    vec3 point3D = vec3(xW, yW, zW);
    cv::Point contourPoint((int)x, (int)y);



    float Xc = point3D.x; float Xc2 = Xc * Xc; float Yc = point3D.y; float Yc2 = Yc * Yc; float Zc = point3D.z; float Zc2 = Zc * Zc;
    float XcYc = Xc * Yc; float dfdxFu = dfdxTerm * G_FU; float dfdyFv = dfdyTerm * G_FU; float overZc2 = 1 / Zc2; float overZc = 1 / Zc;
    pixelJacobi[0] = (dfdyFv * (Yc2 + Zc2) + dfdxFu * XcYc) * overZc2;
    pixelJacobi[1] = (-dfdxFu * (Xc2 + Zc2) - dfdyFv * XcYc) * overZc2;
    pixelJacobi[2] = (-dfdyFv * Xc + dfdxFu * Yc) * overZc;
    pixelJacobi[3] = -dfdxFu * overZc;
    pixelJacobi[4] = -dfdyFv * overZc;
    pixelJacobi[5] = (dfdyFv * Yc + dfdxFu * Xc) * overZc2;


    float commonFirstTermsSum = 0;
    float commonFirstTermsSquaredSum = 0;

    int test = 0;
    while (!innerRayMine.end) {

        test++;
        cv::Point xy = innerRayMine.pos(); innerRayMine++;
        int x = xy.x;
        int y = xy.y;
        float dx = x - contourPoint.x;
        float dy = y - contourPoint.y;
        vec2 dxdy(dx, dy);

        float raw = -glm::dot(dxdy, normal);
        float heavisideTerm = heaviside_pre[(int)raw * 100 + 1000];
        float deltaTerm = delta_pre[(int)raw * 100 + 1000];


        const Vec3b rgb = ante[y * 640 + x];
        int red = rgb[0]; int green = rgb[1]; int blue = rgb[2];
        red = red >> 3; red = red << 10; green = green >> 3; green = green << 5; blue = blue >> 3;
        int colorIndex = red + green + blue;

        pF = pFPointer[colorIndex];
        pB = pBPointer[colorIndex];
        float denAsMul = 1 / (pF + pB + 0.000001);
        pF = pF * denAsMul;

        float pfMinusPb = 2 * pF - 1;
        float denominator = heavisideTerm * (pfMinusPb)+pB + 0.000001;
        float commonFirstTerm = -pfMinusPb / denominator * deltaTerm;

        commonFirstTermsSum += commonFirstTerm;
        commonFirstTermsSquaredSum += commonFirstTerm * commonFirstTerm;

    }
}
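
For reference, the measurement setup described above (calling the function in a loop for a fixed wall-clock time while the profiler samples it) could be sketched roughly as follows. This is only an illustration under assumptions: `runFor` and `workUnderTest` are hypothetical names that do not appear in the original code, and the stand-in workload would be replaced by the real function.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for the profiled function (the real one is the
// double loop shown above). Sums i * 0.5 for i in [0, n).
static float workUnderTest(int n) {
    float acc = 0.0f;
    for (int i = 0; i < n; ++i) {
        acc += static_cast<float>(i) * 0.5f;
    }
    return acc;
}

// Calls `fn` repeatedly until `budget` of wall-clock time has elapsed and
// returns how many calls were made. `fn` always runs at least once.
template <typename Fn>
std::uint64_t runFor(std::chrono::milliseconds budget, Fn&& fn) {
    using clock = std::chrono::steady_clock;  // monotonic, suited to intervals
    const auto deadline = clock::now() + budget;
    std::uint64_t calls = 0;
    do {
        fn();
        ++calls;
    } while (clock::now() < deadline);
    return calls;
}

// Possible usage (one minute, as in the question):
//   volatile float sink = 0.0f;  // keeps the result observable to the optimizer
//   auto calls = runFor(std::chrono::milliseconds(60000),
//                       [&] { sink = workUnderTest(1000); });
```

Running the loop for a fixed duration rather than a fixed iteration count keeps the number of profiler samples roughly constant between runs, which makes the per-line percentages easier to compare.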

[Five profiler screenshots of the function above]

Author: John Katsantas. Reproduced under the CC BY-SA 4.0 license with a link to the original source and this disclaimer.
Link to original article:https://stackoverflow.com/questions/69275680/interpreting-visual-studio-profiler-is-this-subtraction-slow-can-i-make-all-th