Fix std in the summary statistics #360

Merged: guillermo-navas-palencia merged 2 commits into guillermo-navas-palencia:develop from YC-1412:324-fix_std on Jun 3, 2025

YC-1412 (Contributor) commented May 28, 2025

Hi Guillermo,

This PR fixes #324.

When sample_weight is provided, the ssum calculation must apply the weight after squaring $y$, i.e. it should accumulate $w_i y_i^2$. Squaring ymask instead multiplies by the weight first and then takes the square, which gives $(w_i y_i)^2$. This is related to #323 at some level (both are about sample weight) but needs a separate fix.

$$\begin{align} std^2 &= \frac{\sum w_i (y_i-\bar{y})^2}{\sum w_i} \\\ &= \frac{\sum w_i y_i^2 + \bar{y}^2\sum w_i - 2\bar{y}\sum w_i y_i}{\sum w_i}\\\ &= \frac{\sum w_i y_i^2 + \bar{y}^2\sum w_i - 2\bar{y}^2\sum w_i}{\sum w_i}\\\ &= \frac{\sum w_i y_i^2 - \bar{y}^2\sum w_i}{\sum w_i}\\\ &= \frac{\sum w_i y_i^2}{\sum w_i} - \bar{y}^2\\\ std &= \sqrt{\frac{\sum w_i y_i^2}{\sum w_i} - \bar{y}^2} = \texttt{np.sqrt(s\_ssums / s\_n\_records - mean ** 2)} \end{align}$$

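where s_ssums = $\sum w_i y_i^2$, s_n_records = $\sum w_i$, and mean = $\bar{y} = \sum w_i y_i / \sum w_i$. The last identity, $\sum w_i y_i = \bar{y}\sum w_i$, is what takes the second line to the third.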
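
For illustration, here is a minimal standalone sketch of the formula above. It is not the library code: the function names `weighted_std` and `weighted_std_weight_first` are hypothetical, and the clamp of tiny negative variances to zero is an assumption added here to guard against round-off. The second function reproduces the weight-first ordering described above so the two results can be compared.

```python
import numpy as np


def weighted_std(y, w):
    """Weighted std via sum(w * y^2) / sum(w) - mean^2 (hypothetical helper)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n_records = w.sum()                # sum of weights
    mean = (w * y).sum() / n_records   # weighted mean
    ssum = (w * y ** 2).sum()          # weight applied AFTER squaring
    var = ssum / n_records - mean ** 2
    # Assumption: clamp round-off negatives so sqrt never returns nan.
    return np.sqrt(max(var, 0.0))


def weighted_std_weight_first(y, w):
    """Same formula, but squaring the already-weighted values (incorrect)."""
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    n_records = w.sum()
    mean = (w * y).sum() / n_records
    ssum = ((w * y) ** 2).sum()        # weight applied BEFORE squaring
    var = ssum / n_records - mean ** 2
    return np.sqrt(max(var, 0.0))


# A group where every target value is identical should have std = 0.
y = [9.0, 9.0, 9.0]
w = [2.0, 2.0, 2.0]
print(weighted_std(y, w))               # 0.0
print(weighted_std_weight_first(y, w))  # 9.0
```

With constant targets and non-unit weights, the weight-first ordering reports a non-zero spread, while the corrected ordering agrees with the direct definition `np.sqrt(np.average((y - mean) ** 2, weights=w))`.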

I used the test code in #324 to test the fix. Row 3 (group [8.5, 9.5)) now has std=0 as expected.

Before: [image]

After: [image]

guillermo-navas-palencia added the bug (Something isn't working) label Jun 3, 2025
guillermo-navas-palencia added this to the v0.21.0 milestone Jun 3, 2025

guillermo-navas-palencia (Owner) commented

Very nice work! Thanks for looking at it.

guillermo-navas-palencia (Owner) commented

I think this branch must be updated with develop to pass the tests.

guillermo-navas-palencia merged commit 3eb7a34 into guillermo-navas-palencia:develop Jun 3, 2025
12 checks passed
guillermo-navas-palencia mentioned this pull request Oct 26, 2025