DRAFT: 285 rfm model #422
…bins func and writes txt file with bins, needs testing
…l LLCs from RFM score analysis based on recommendations
…abel list @c-simpsom
…ction which pulls data from tables and creates scores based on labeled bins
…st of tuples for cleaned and matched data
…fo. Edges_dicts moved into rfm_functions.py. Updated requirements.txt to import modules necessary to run funcs. Data is uploaded to rfm_database once scores are calculated.
Contained secrets.
…peline into 285-rfm-model
…able/renaming var lines 85- 90. date_difference calculation now uses max close donation date instead of query date.
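For reference, a minimal sketch of the recency change described in the last commit: date differences are measured against the latest donation date in the data rather than the query date. The function name `recency_days`, the dict layout, and the sample dates are hypothetical and are not taken from `create_scores.py`.

```python
from datetime import date


def recency_days(last_donation_by_donor, reference_date=None):
    """Days between each donor's latest donation and a reference date.

    If reference_date is None, the latest donation date found in the data
    is used, so the most recent donor always has a recency of 0.
    """
    if not last_donation_by_donor:
        return {}
    if reference_date is None:
        # Assumption: "max close donation date" means the newest date in the data.
        reference_date = max(last_donation_by_donor.values())
    return {
        donor: (reference_date - donated).days
        for donor, donated in last_donation_by_donor.items()
    }


# Hypothetical sample data
last_donation = {
    "a": date(2021, 8, 1),
    "b": date(2021, 6, 2),
    "c": date(2020, 8, 1),
}
recency = recency_days(last_donation)
```

With this choice the recency values no longer shift when the script is run on a later day with the same data.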
I'm getting this error running create_scores():
with rfm_edges as:
|
@c-simpson looks like a pandas issue with handling Apparently, the bug is fixed in that version as I haven't had this issue. I also ran the script using the API and it seemed okay.... I'll keep digging around on this issue, but my temporary solution is to just update pandas to 1.3.2 |
I'm still getting it with 1.3.2 |
We poked at this while you weren't around and noticed that
adds 43 to the end of recency_bins, which causes the 'monotonic' issue. In the debugger, if I put 43 in the proper sort position, we don't get the monotonic complaint. We didn't know exactly what you were doing so we stopped there. |
@bdeck8317 I created a branch ( 285-sort-after-append ) off yours that does nothing more than sort recency_bins after you append. It runs but only creates scores for 754 matching_ids. So I'll stop messing with things I don't understand and will let you carry on! |
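A minimal sketch of the sort-position fix discussed above, using only the standard library. `pandas.cut` requires bin edges that increase monotonically, so an edge appended to the end of the list has to be placed at its sorted position instead. The edge values and helper names here are hypothetical, not the ones in `rfm_functions.py`.

```python
from bisect import bisect_left, insort


def is_strictly_increasing(edges):
    """True when every edge is larger than the one before it."""
    return all(a < b for a, b in zip(edges, edges[1:]))


def add_edge(edges, new_edge):
    """Return a copy of edges with new_edge at its sorted position, no duplicates."""
    result = list(edges)
    if new_edge not in result:
        insort(result, new_edge)
    return result


def bin_index(value, edges):
    """Index of the right-closed interval (edges[i], edges[i+1]] holding value.

    Returns None when value falls outside the edges.
    """
    if value <= edges[0] or value > edges[-1]:
        return None
    return bisect_left(edges, value) - 1


recency_bins = [0, 30, 60, 90]      # hypothetical edges
appended = recency_bins + [43]      # the failing pattern: 43 lands after 90
fixed = add_edge(recency_bins, 43)  # 43 placed between 30 and 60
```

Note that inserting an edge also adds a bin, so any label list passed alongside the edges needs one more entry to stay one shorter than the edge list.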
…ed out a bug which was appending improper max values. Now it is robust to max values: it uses the data max if that is below the max bin edge, otherwise the pre-existing max bin edge.
@c-simpson @sposerina ready for review and initial integration. There is one conflict. Not sure how to address it. |
The 'append 43' is making me think about robustness:
|
Spoke to BD (who is swamped) - I'll make the above changes this week. |
Sorry, I missed the mention! I think we discussed briefly, but my thoughts if any of the 3 failures happens:
If one of those fails, what should we do? We don't want to push bad data. @kfettich Any thoughts? -> see above; I don't expect there to be dramatic changes from one cycle to the other, so using outdated RFM scores from the previous cycle should be ok; doing it more than once will be problematic though |
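One way the fallback described above could look, as a rough sketch: reuse the previous cycle's scores when the new calculation fails, but refuse to do so for more than one cycle in a row. The function name `select_scores`, the counter, and the `max_stale_cycles` limit are assumptions for illustration; nothing like this exists in the branch yet.

```python
def select_scores(new_scores, previous_scores, stale_cycles, max_stale_cycles=1):
    """Choose which RFM scores to publish for this cycle.

    new_scores is None when this cycle's calculation failed.
    stale_cycles counts how many consecutive cycles already reused old scores.
    Returns (scores, updated_stale_cycles).
    """
    if new_scores is not None:
        # Fresh scores are available, so the stale counter resets.
        return new_scores, 0
    if previous_scores is None or stale_cycles >= max_stale_cycles:
        # Nothing safe to publish: stop instead of pushing bad data.
        raise RuntimeError("RFM scoring failed and no acceptable fallback exists")
    return previous_scores, stale_cycles + 1
```

Raising on the second consecutive failure keeps stale scores from silently persisting across cycles.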
Meet to discuss integration into pipeline. Will need a var passed into create_scores.py, which is a str(query_date).
Updates the rfm_scores table in the postgres database.
Adds a number of module dependencies to requirements.txt.
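A possible shape for receiving the query date as a string, sketched under the assumption that `query_date` is a `datetime.date`, so that `str(query_date)` yields an ISO `YYYY-MM-DD` value. The parser and the helper name `parse_query_date` are hypothetical; the actual interface of `create_scores.py` is still to be agreed.

```python
import argparse
from datetime import date


def parse_query_date(text):
    """Convert an ISO date string such as '2021-08-15' back into a date."""
    return date.fromisoformat(text)


def build_parser():
    """Command-line parser taking the query date as one positional argument."""
    parser = argparse.ArgumentParser(description="Create RFM scores (sketch)")
    parser.add_argument(
        "query_date",
        type=parse_query_date,
        help="query date in YYYY-MM-DD form, i.e. str(query_date)",
    )
    return parser


# Simulated invocation; a real run would call parse_args() with no arguments.
args = build_parser().parse_args(["2021-08-15"])
```

Parsing at the boundary means a malformed date is rejected before any scores are calculated or written.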