From 85c2cf24202f2ecb31005bb905db30661a569304 Mon Sep 17 00:00:00 2001 From: STEFANO GRASSI Date: Tue, 27 Aug 2024 14:06:12 +0000 Subject: [PATCH] Replace dsm060-cw2.ipynb --- dsm060-cw2.ipynb | 122 +++++++++++++++++++++++++++++------------------ 1 file changed, 75 insertions(+), 47 deletions(-) diff --git a/dsm060-cw2.ipynb b/dsm060-cw2.ipynb index af153bf..81e29ca 100644 --- a/dsm060-cw2.ipynb +++ b/dsm060-cw2.ipynb @@ -37,9 +37,9 @@ "\n", "The proposed project aims to benefit a wide range of stakeholders in both academic and practical domains.\n", "\n", - "In academia, researchers might benefit from a thorough literature review alongside the additional empirical evidence provided by the project. This could contribute to the fields of Machine Learning and Forecasting by strengthening previous findings and encouraging further research on Conformal Prediction.\n", + "In academia, researchers might benefit from a thorough literature review alongside the additional empirical evidence provided by the project. This could contribute to the fields of Machine Learning and Forecasting by strengthening previous findings and encouraging further research on Conformal Prediction, specifically in the time series domain.\n", "\n", - "In industry, stakeholders across various sectors might gain from additional empirical tests on using Conformal Prediction to estimate more reliable prediction intervals. The existing literature on Conformal Prediction in the context of time series is still nascent and more empirical tests are needed to faciliate further adoptions.\n", + "In industry, stakeholders across various sectors might gain from additional empirical tests on using Conformal Prediction to estimate more reliable prediction intervals. 
The existing literature on Conformal Prediction in the context of time series is still nascent and more empirical tests are needed to facilitate further adoption.\n", "\n", "Lastly, academic advisors and data providers are crucial stakeholders for the success of this project. Advisors will offer guidance on the project development and Data providers will facilitate access to the datasets necessary for testing and validating the proposed methods.\n", "\n", @@ -82,7 +82,7 @@ "\n", "In addition to these CP-based methods, traditional methods commonly used in probabilistic forecasting will be employed for comparison. These conventional approaches will provide a baseline against which the performance of CP methods can be measured, ensuring a comprehensive evaluation of their effectiveness in uncertainty quantification.\n", "\n", - "The availability of code for all two CP methods ensures they can be rigorously tested against these traditional benchmark models. Each method represents a different approach to uncertainty quantification in time series forecasting, offering valuable insights into their applicability and effectiveness across different scenarios compared to existing probabilistic methods.\n", + "The availability of code for both CP methods ensures they can be rigorously tested against these traditional benchmark models. Each method represents a different approach to uncertainty quantification in time series forecasting, offering valuable insights into their applicability and effectiveness across different scenarios compared to existing methods.\n", "\n", "## Work plan\n", "\n", @@ -92,8 +92,8 @@ "\n", "#### Introduction Writing\n", "\n", - "* Key Step: Drafting the introduction to outline objectives, background, and significance.\n", - "* Duration: this step is estimated to take few days, depending on how detailed the introduction needs to be. 
Further \n", + "* Key Step: Drafting the introduction to outline objectives, background and significance.\n", + "* Duration: this step is estimated to take a few days, depending on how detailed the introduction needs to be. Further revisions will be required to include a brief summary of key findings.\n", "\n", "#### Literature review\n", "\n", @@ -102,18 +102,18 @@ "\n", "#### Data Collection and Assessment\n", "\n", - "Key Step: Gathering relevant data and assessing its suitability for the experimental setup.\n", - "Duration: given that Monash Time Series Archive does include the few days, depending on the availability and accessibility of the data.\n", + "* Key Step: Gathering relevant data from the Monash Archive and assessing its suitability for the experimental setup.\n", + "* Duration: a few days, depending on the availability and accessibility of the data in the Monash Time Series Archive.\n", "\n", "#### Methodology Development\n", "\n", - "Key Step: Developing the research methodology, including techniques and processes.\n", - "Duration: 1 week, as it involves planning and detailing the approach for the experimental phase.\n", + "* Key Step: Developing the research methodology, including techniques and processes.\n", + "* Duration: 1 week, as it involves planning and detailing the approach for the experimental phase.\n", "\n", "#### Validation\n", "\n", - "Key Step: Validating research questions, methodology, and data assessment to ensure readiness for Phase 2.\n", - "Duration: few days, to ensure that the methodology and data are robust and feasible.\n", + "* Key Step: Validating research questions, methodology, and data assessment to ensure readiness for Phase 2.\n", + "* Duration: a few days, to ensure that the methodology and data are robust and feasible.\n", "\n", "### Phase 2 - Month 2\n", "\n", @@ -124,13 +124,13 @@ "\n", "#### Model Building and Test\n", "\n", - "Key Step: Developing and testing models based on the validated 
methodology.\n", - "Duration: 1-2 weeks, as it involves iterative testing and refinement of models.\n", + "* Key Step: Developing and testing models based on the validated methodology.\n", + "* Duration: 1-2 weeks, as it involves iterative testing and refinement of models.\n", "\n", "#### Analysis\n", "\n", - "Key Step: Analyzing results and developing conclusions.\n", - "Duration: 1 week, as it involves interpreting results and synthesizing findings.\n", + "* Key Step: Analyzing results and developing conclusions.\n", + "* Duration: 1 week, as it involves interpreting results and synthesizing findings.\n", "\n", "#### Conclusion Writing\n", "\n", @@ -139,66 +139,85 @@ "\n", "#### Key Milestones for Evaluation\n", "\n", - "End of Week 1 (Phase 1): Completion of Introduction Writing and start of Literature Review.\n", - "Milestone: Draft of the introduction should be complete.\n", + "**End of Week 1 (Phase 1):**\n", + "* **Milestone:** Completion of the Introduction Draft.\n", + " * Task: Draft the introduction and begin the literature review.\n", "\n", - "End of Week 2 (Phase 1): Completion of Literature Review and Gap Analysis.\n", - "Milestone: Identification of research gaps and completion of the literature review.\n", + "**End of Week 2 (Phase 1):**\n", + "* **Milestone:** Completion of Literature Review.\n", + " * Task: Identify research gaps and complete the literature review.\n", "\n", - "End of Week 3 (Phase 1): Completion of Data Collection and Assessment, and Methodology Development.\n", - "Milestone: Data should be collected and assessed, and methodology should be developed.\n", + "**End of Week 3 (Phase 1):**\n", + "* **Milestone:** Completion of Data Collection and Methodology Development.\n", + " * Task: Collect and assess data, and develop the methodology.\n", "\n", - "End of Week 4 (Phase 1): Completion of Validation.\n", - "Milestone: Validation of the research questions, methodology, and data assessment.\n", + "**End of Week 4 (Phase 1):**\n", + "* 
**Milestone:** Validation of Research Framework.\n", + " * Task: Validate the research questions, methodology, and data assessment.\n", "\n", - "End of Week 5 (Phase 2): Completion of Data Processing.\n", - "Milestone: Data should be cleaned and ready for analysis.\n", + "**End of Week 5 (Phase 2):**\n", + "* **Milestone:** Completion of Data Processing.\n", + " * Task: Clean the data and prepare it for analysis.\n", "\n", - "End of Week 6 (Phase 2): Completion of Model Building and Testing.\n", - "Milestone: Initial models should be built and tested.\n", + "**End of Week 6 (Phase 2):**\n", + "* **Milestone:** Completion of Model Building and Testing.\n", + " * Task: Build and test initial models.\n", "\n", - "End of Week 7 (Phase 2): Completion of Analysis.\n", - "Milestone: Results should be analyzed, and initial findings should be developed.\n", + "**End of Week 7 (Phase 2):**\n", + "* **Milestone:** Completion of Analysis.\n", + " * Task: Analyze results and develop initial findings.\n", "\n", - "End of Week 8 (Phase 2): Completion of Conclusion Writing and overall project wrap-up.\n", - "Milestone: Conclusion should be drafted, and the final project report should be prepared\n", + "**End of Week 8 (Phase 2):**\n", + "* **Milestone:** Completion of Conclusion Writing and Project Wrap-Up.\n", + " * Task: Draft the conclusion, complete the final editing, revisions, and proofreading, and finalize the project report.\n", "\n", "## Risk assessment\n", "\n", - "Effective risk management is crucial for the success of the proposed project. The risks identified are categorized into two types: project-based risks and non-project-based risks.\n", + "Effective risk management is crucial for the success of the proposed project. 
Three project-based risks have been identified.\n", "\n", - "### Project-Based Risks\n", - "\n", - "#### Availability of Resources\n", + "### Availability of Resources\n", "\n", "* Risk: Types of might be unavailable or incomplete, and necessary hardware might not meet technical requirements.\n", "* Mitigation: Validation of data acquisition sources was conducted to ensure reliability, thus the choice of Monash Time Series Archive. Hardware requirements have been pre-assessed ensuring they can run on the hardware currently in use which is Mac M1 (2020).\n", "\n", - "#### Scope Creep\n", + "### Scope Creep\n", "\n", "* Risk: The scope of the project could lead to an unmanageable workload such as extensive data preprocessing.\n", "* Mitigation: The project’s aims and objectives will be adjusted to a nicher case if needed without compromising the core research question. This approach maintains focus while ensuring the project remains feasible.\n", "\n", - "#### Feasibility and Complexity\n", + "### Feasibility and Complexity\n", "\n", "* Risk: The complexity of the selected models may exceed current skills, leading to delays.\n", "* Mitigation: Iterative project planning will allow the focus to be narrowed if necessary. For example, focusing on a specific industry could reduce the number of datasets and simplify the models.\n", "\n", - "### Non-Project-Based Risks\n", + "## Expected results\n", + "\n", + "This project leverages approaches from the M5 Uncertainty competition, particularly in quantifying uncertainty in time series forecasting through Conformal Prediction. The focus will be on generating quantile forecasts at 9 nominal probability levels \n", + "$u \\in \\{0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, 0.995\\}$ to predict the median and construct 4 central prediction intervals (PIs) at confidence levels of $50\\%$, $67\\%$, $95\\%$, and $99\\%$. 
These outputs are expected to help accurately characterize the distribution's center and tails, offering a comprehensive understanding of forecast uncertainty (Makridakis et al., 2022).\n", + "\n", + "However, it is important to approach the expected results with caution. Time series data is inherently complex, with varying time dependencies, different forecasting horizons, and industry-specific factors, making it challenging to generalize findings. The outcomes of this project should therefore be interpreted cautiously, recognizing that they may be valid within the specific context of the data used but may not be directly applicable to other datasets or domains without careful consideration.\n", + "\n", + "This does not mean that the project is without value. On the contrary, it serves as a foundation for further exploration and research in the field of Uncertainty Quantification in the time series setting. By aligning with key future directions from the M5 competition (such as further developing Machine Learning methods for forecasting and enhancing reproducibility and practical implementation), the project aims to incentivize additional work in this area. Moreover, it seeks to raise awareness among academics and practitioners about the importance of understanding and communicating the full extent of uncertainty in forecasting, particularly in light of unpredictable events like COVID-19, which can lead to significant and unforeseen risks.\n", + "\n", + "## Evaluation\n", "\n", - "#### Personal Emergencies\n", + "The evaluation of this project focuses on the precision of probabilistic forecasts, inspired by the M5 Uncertainty competition. The primary metric used is the Scaled Pinball Loss (SPL), calculated for each time series and quantile. 
SPL assesses forecast accuracy by penalizing deviations from actual values, using the following formula:\n", "\n", - "* Risk: Unexpected personal events, such as the illness or death of a close relative (grandmother), could disrupt the project timeline.\n", - "* Mitigation: A buffer period has been built into the work plan to accommodate potential delays. Early project initiation and strict adherence to milestones will help maintain progress.\n", + "$$\n", + "\\text{SPL}(u) = \\frac{1}{h} \\cdot \\frac{\\sum_{t=n+1}^{n+h} \\left[(Y_t - Q_t(u)) \\cdot u \\cdot \\mathbf{1}\\{Q_t(u) \\leq Y_t\\} + (Q_t(u) - Y_t) \\cdot (1 - u) \\cdot \\mathbf{1}\\{Q_t(u) > Y_t\\}\\right]}{\\frac{1}{n-1} \\sum_{t=2}^{n} \\left|Y_t - Y_{t-1}\\right|}\n", + "$$\n", "\n", - "#### Health and Performance Issues\n", - "* Risk: Decreased performance due to health-related issues could impact the quality and timeliness of the work.\n", - "* Mitigation: Preventative measures, including a balanced diet and regular physical exercise, will be prioritized to maintain health and performance throughout the project.\n", + "where $Y_t$ is the actual value at time $t$, $Q_t(u)$ is the forecasted quantile, $h$ is the forecasting horizon (which can be 28 days, as suggested by the competition, or adjusted based on the datasets), $n$ is the length of the training sample, and $\\mathbf{1}$ is the indicator function. The denominator scales the loss by the average absolute one-step change in the actual values over the training period, which makes SPL scale-independent and comparable across series and datasets (Hyndman & Koehler, 2006).\n", "\n", + "The Weighted Scaled Pinball Loss (WSPL) will then aggregate SPL across all the time series analyzed and the nine quantiles to rank the performance of the methods. 
WSPL is computed as:\n", "\n", - "## Expected result\n", + "$$\n", + "\\text{WSPL} = \\sum_{i=1}^{\\text{n\\_series}} \\frac{1}{\\text{n\\_series}} \\cdot \\frac{1}{9} \\sum_{j=1}^{9} \\text{SPL}_i(u_j)\n", + "$$\n", "\n", - "## Evaluation" + "where $\\text{n\\_series}$ is the number of time series analyzed, $\\text{SPL}_i(u_j)$ is the SPL of series $i$ at quantile level $u_j$, and each series is weighted equally. This approach differs from the WSPL proposed in the M5 competition, which weights each series based on recent actual sales. While equal weighting might be further refined in future work, it provides an effective starting point.\n", + "\n", + "To evaluate whether the aims of the project are achieved, the `ACI` and `EnbPI` methods will be benchmarked, and their WSPL scores compared against traditional methods such as Naive, Seasonal Naive (sNaive), Simple Exponential Smoothing (SES), Exponential Smoothing (ES), AutoRegressive Integrated Moving Average (ARIMA), and Kernel density estimates. Although this choice mirrors the M5 competition and may be subject to debate, the author believes it highlights the advantages of Conformal Prediction, particularly as traditional benchmarks, which are widely used, often assume normally distributed forecast errors, an assumption rarely met in practice." ] }, { @@ -245,8 +264,17 @@ "series forecasting: Two new approaches, arXiv e-prints arXiv:2103.14200.\n", "31. Samya Tajmouati, Bouazza EL Wahbi & Mohamed Dakkon (2024) Applying regression conformal prediction with nearest neighbors to time series data, Communications in Statistics - Simulation and Computation, 53:4, 1768-1778, DOI: 10.1080/03610918.2022.2057538 (Tajmouati et al., 2024).\n", "32. Kelly, M., Longjohn, R., & Nottingham, K., The UCI Machine Learning Repository. Available at: https://archive.ics.uci.edu [Accessed 12 Aug. 2024].\n", - "33. Godahewa, R., Bergmeir, C., Webb, G.I., Hyndman, R.J., & Montero-Manso, P., 2021. Monash Time Series Forecasting Archive. In Neural Information Processing Systems Track on Datasets and Benchmarks." 
+ "33. Godahewa, R., Bergmeir, C., Webb, G.I., Hyndman, R.J., & Montero-Manso, P., 2021. Monash Time Series Forecasting Archive. In Neural Information Processing Systems Track on Datasets and Benchmarks.\n", + "34. Hyndman, R.J. & Koehler, A.B., 2006. Another look at measures of forecast accuracy. International Journal of Forecasting, 22(4), pp.679–688. Available at: http://dx.doi.org/10.1016/j.ijforecast.2006.03.001." ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "aa33a4a3-0dfa-4365-a6df-2f366d668908", + "metadata": {}, + "outputs": [], + "source": [] } ], "metadata": { -- GitLab
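The SPL and WSPL definitions in the Evaluation cell of this patch could be computed along the following lines. This is only a minimal sketch, not the notebook's implementation: the function names `scaled_pinball_loss` and `weighted_spl` and the `(y_train, y_test, forecasts)` input layout are assumptions made for illustration. It divides the pinball loss by the mean absolute one-step change of the training data, as the text describes, and weights every series equally.

```python
from typing import Dict, Iterable, Sequence, Tuple

# The nine nominal probability levels listed in the "Expected results" cell.
QUANTILES = (0.005, 0.025, 0.165, 0.250, 0.500, 0.750, 0.835, 0.975, 0.995)


def scaled_pinball_loss(
    y_train: Sequence[float],
    y_test: Sequence[float],
    q_forecast: Sequence[float],
    u: float,
) -> float:
    """SPL for one series and one quantile level u (hypothetical helper)."""
    if len(y_test) != len(q_forecast):
        raise ValueError("y_test and q_forecast must have the same length")
    if len(y_train) < 2:
        raise ValueError("need at least two training observations to scale")
    h = len(y_test)
    # Pinball loss: u * (Y - Q) when Q <= Y, (1 - u) * (Q - Y) otherwise.
    pinball = sum(
        (y - q) * u if q <= y else (q - y) * (1 - u)
        for y, q in zip(y_test, q_forecast)
    ) / h
    # Scale: mean absolute one-step change over the training sample.
    scale = sum(abs(b - a) for a, b in zip(y_train, y_train[1:])) / (len(y_train) - 1)
    if scale == 0:
        raise ValueError("constant training series: the scale is zero")
    return pinball / scale


def weighted_spl(
    series: Iterable[Tuple[Sequence[float], Sequence[float], Dict[float, Sequence[float]]]],
    quantiles: Sequence[float] = QUANTILES,
) -> float:
    """Equal-weight WSPL: average SPL over quantiles, then over series."""
    per_series = [
        sum(scaled_pinball_loss(y_train, y_test, forecasts[u], u) for u in quantiles)
        / len(quantiles)
        for y_train, y_test, forecasts in series
    ]
    if not per_series:
        raise ValueError("no series supplied")
    return sum(per_series) / len(per_series)
```

Each element of `series` pairs a training sample and a test sample with a dictionary mapping every quantile level to its forecast path. A constant training series has a zero scale, so the sketch raises an error there instead of guessing a fallback.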