Doina Precup

Academic papers

2022 2021 2020 2019 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2004 2003 2002 2001 2000 1999 1998 1997 1996

2022

Towards Painless Policy Optimization for Constrained MDPs

A Jain, S Vaswani, R Babanezhad, C Szepesvari, D Precup

arXiv preprint arXiv:2204.05176


Connecting weighted automata, tensor networks and recurrent neural networks through spectral learning

T Li, D Precup, G Rabusseau

Machine Learning, 1-35


COptiDICE: Offline Constrained Reinforcement Learning via Stationary Distribution Correction Estimation

J Lee, C Paduraru, DJ Mankowitz, N Heess, D Precup, KE Kim, A Guez

arXiv preprint arXiv:2204.08957


Attention Option-Critic

R Chunduru, D Precup

arXiv preprint arXiv:2201.02628


Understanding Decision-Time vs. Background Planning in Model-Based Reinforcement Learning

S Alver, D Precup

arXiv preprint arXiv:2206.08442


GP.2 Deep learning prediction of response to disease modifying therapy in primary progressive multiple sclerosis

JR Falet, J Durso-Finley, B Nichyporuk, J Schroeter, F Bovis, M Sormani, D Precup, T Arbel, DL Arnold

Canadian Journal of Neurological Sciences 49 (s1), S1-S1


ABOVE PROBLEM it doesn't seem to have the PDF online; you need to download it off the website


Improving Robustness against Real-World and Worst-Case Distribution Shifts through Decision Region Quantification

Leo Schwinn, Leon Bungert, An Nguyen, René Raab, Falk Pulsmeyer, Doina Precup, Björn Eskofier, Dario Zanca

arXiv preprint arXiv:2205.09619


Deep Learning Prediction of Response to Disease Modifying Therapy in Primary Progressive Multiple Sclerosis (P1-1. Virtual)

Jean-Pierre René Falet, Joshua Durso-Finley, Brennan Nichyporuk, Julien Schroeter, Francesca Bovis, Maria-Pia Sormani, Doina Precup, Tal Arbel, Douglas Arnold

Neurology 98 (18 Supplement)


ABOVE PROBLEM this is a duplicate paper as well, and you can't get the link to a PDF


Learning how to Interact with a Complex Interface using Hierarchical Reinforcement Learning

Gheorghe Comanici, Amelia Glaese, Anita Gergely, Daniel Toyama, Zafarali Ahmed, Tyler Jackson, Philippe Hamel, Doina Precup

arXiv preprint arXiv:2204.10374


Behind the Machine's Gaze: Biologically Constrained Neural Networks Exhibit Human-like Visual Attention

L Schwinn, D Precup, B Eskofier, D Zanca

arXiv preprint arXiv:2204.09093


Deep learning, reinforcement learning, and world models

Yutaka Matsuo, Yann LeCun, Maneesh Sahani, Doina Precup, David Silver, Masashi Sugiyama, Eiji Uchibe, Jun Morimoto

Neural Networks


Don't Freeze Your Embedding: Lessons from Policy Finetuning in Environment Transfer

V Dean, DK Toyama, D Precup

ICLR Workshop on Agent Learning in Open-Endedness


Selective Credit Assignment

V Chelu, D Borsa, D Precup, H van Hasselt

arXiv preprint arXiv:2202.09699


Device-free localization methods within smart indoor environments

N Ghourchian, MA MARTINEZ, D Precup

US Patent App. 17/514,343


Improving Sample Efficiency of Value Based Models Using Attention and Vision Transformers

AA Kalantari, M Amini, S Chandar, D Precup

arXiv preprint arXiv:2202.00710


Why Should I Trust You, Bellman? The Bellman Error is a Poor Replacement for Value Error

S Fujimoto, D Meger, D Precup, O Nachum, SS Gu

arXiv preprint arXiv:2201.12417


The Paradox of Choice: Using Attention in Hierarchical Reinforcement Learning

A Nica, K Khetarpal, D Precup

arXiv preprint arXiv:2201.09653

2021

Reward is enough

D Silver, S Singh, D Precup, RS Sutton

Artificial Intelligence 299, 103535


Gradient starvation: A learning proclivity in neural networks

M Pezeshki, O Kaba, Y Bengio, AC Courville, D Precup, G Lajoie

Advances in Neural Information Processing Systems 34, 1256-1272


Safe option-critic: learning safety in the option-critic architecture

A Jain, K Khetarpal, D Precup

The Knowledge Engineering Review 36


On the expressivity of Markov reward

D Abel, W Dabney, A Harutyunyan, MK Ho, M Littman, D Precup, S Singh

Advances in Neural Information Processing Systems 34, 7799-7812


Flow network based generative models for non-iterative diverse candidate generation

E Bengio, M Jain, M Korablyov, D Precup, Y Bengio

Advances in Neural Information Processing Systems 34, 27381-27394


A consciousness-inspired planning agent for model-based reinforcement learning

M Zhao, Z Liu, S Luan, S Zhang, D Precup, Y Bengio

Advances in Neural Information Processing Systems 34, 1569-1581


A survey of exploration methods in reinforcement learning

S Amin, M Gomrokchi, H Satija, H van Hoof, D Precup

arXiv preprint arXiv:2109.00157


Correcting momentum in temporal difference learning

E Bengio, J Pineau, D Precup

arXiv preprint arXiv:2106.03955


AndroidEnv: a reinforcement learning platform for Android

Daniel Toyama, Philippe Hamel, Anita Gergely, Gheorghe Comanici, Amelia Glaese, Zafarali Ahmed, Tyler Jackson, Shibl Mourad, Doina Precup

arXiv preprint arXiv:2105.13231


Self-supervised attention-aware reinforcement learning

H Wu, K Khetarpal, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 35 (12), 10311-10319


ABOVE PROBLEM the website forces you to download the PDF


Constructing a good behavior basis for transfer using generalized policy updates

S Alver, D Precup

arXiv preprint arXiv:2112.15025


A deep reinforcement learning approach to marginalized importance sampling with the successor representation

S Fujimoto, D Meger, D Precup

International Conference on Machine Learning, 3518-3529


Variance penalized on-policy and off-policy actor-critic

A Jain, G Patil, A Jain, K Khetarpal, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 35 (9), 7899-7907


ABOVE PROBLEM again you have to download the PDF


Training a first-order theorem prover from synthetic data

Vlad Firoiu, Eser Aygun, Ankit Anand, Zafarali Ahmed, Xavier Glorot, Laurent Orseau, Lei Zhang, Doina Precup, Shibl Mourad

arXiv preprint arXiv:2103.03798


Temporal abstraction in reinforcement learning with the successor representation

MC Machado, A Barreto, D Precup

arXiv preprint arXiv:2110.05740


Improving long-term metrics in recommendation systems using Short-Horizon Reinforcement Learning

B Mazoure, P Mineiro, P Srinath, RS Sedeh, D Precup, A Swaminathan

arXiv preprint arXiv:2106.00589


Single-shot pruning for offline reinforcement learning

SY Arnob, R Ohib, S Plis, D Precup

arXiv preprint arXiv:2112.15579


Flexible Option Learning

M Klissarov, D Precup

Advances in Neural Information Processing Systems 34, 4632-4646


Temporally abstract partial models

K Khetarpal, Z Ahmed, G Comanici, D Precup

Advances in Neural Information Processing Systems 34, 1979-1991


Policy gradients incorporating the future

D Venuto, E Lau, D Precup, O Nachum

arXiv preprint arXiv:2108.02096


Optimal Spectral-Norm Approximate Minimization of Weighted Finite Automata

B Balle, C Lacroce, P Panangaden, D Precup, G Rabusseau

arXiv preprint arXiv:2102.06860


Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning

M Gomrokchi, S Amin, H Aboutalebi, A Wong, D Precup

arXiv preprint arXiv:2109.03975


Preferential Temporal Difference Learning

N Anand, D Precup

arXiv preprint arXiv:2106.06508


Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning

SY Arnob, R Islam, D Precup

arXiv preprint arXiv:2112.15578


Device-free localization methods within smart indoor environments

N Ghourchian, MA MARTINEZ, D Precup

US Patent 11,212,650


ABOVE PROBLEM this is the same as something in 2022


Proving Theorems using Incremental Learning and Hindsight Experience Replay

Eser Aygün, Laurent Orseau, Ankit Anand, Xavier Glorot, Vlad Firoiu, Lei M Zhang, Doina Precup, Shibl Mourad

arXiv preprint arXiv:2112.10664


Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning

S Yeasar Arnob, R Islam, D Precup

arXiv e-prints, arXiv:2112.15578


ABOVE is the same as something else in 2021


Why Should I Trust You, Bellman? Evaluating the Bellman Objective with Off-Policy Data

S Fujimoto, D Meger, D Precup, O Nachum, SS Gu


Is Heterophily A Real Nightmare For Graph Neural Networks on Performing Node Classification?

S Luan, C Hua, Q Lu, J Zhu, M Zhao, S Zhang, XW Chang, D Precup


Improving Long-Term Metrics in Recommendation Systems using Short-Horizon Reinforcement Learning

B Mazoure, P Mineiro, P Srinath, RS Sedeh, D Precup, A Swaminathan

arXiv preprint arXiv:2106.00589


ABOVE is the same as another one but I made them both link to the same thing (one says offline RL, one says reinforcement learning)


What is Going on Inside Recurrent Meta Reinforcement Learning Agents?

S Alver, D Precup

arXiv preprint arXiv:2104.14644


Estimating treatment effect for individuals with progressive multiple sclerosis using deep learning

Jean-Pierre R Falet, Joshua Durso-Finley, Brennan Nichyporuk, Julien Schroeter, Francesca Bovis, Maria-Pia Sormani, Doina Precup, Tal Arbel, Douglas Lorne Arnold

medRxiv

2020

Exploring uncertainty measures in deep networks for multiple sclerosis lesion detection and segmentation

T Nair, D Precup, DL Arnold, T Arbel

Medical image analysis 59, 101557


Fast reinforcement learning with generalized policy updates

A Barreto, S Hou, D Borsa, D Silver, D Precup

Proceedings of the National Academy of Sciences 117 (48), 30079-30087


Towards continual reinforcement learning: A review and perspectives

K Khetarpal, M Riemer, I Rish, D Precup

arXiv preprint arXiv:2012.13490


Invariant causal prediction for block MDPs

Amy Zhang, Clare Lyle, Shagun Sodhani, Angelos Filos, Marta Kwiatkowska, Joelle Pineau, Yarin Gal, Doina Precup

International Conference on Machine Learning, 11214-11224


What can I do here? A Theory of Affordances in Reinforcement Learning

K Khetarpal, Z Ahmed, G Comanici, D Abel, D Precup

International Conference on Machine Learning, 5243-5253


Interference and generalization in temporal difference learning

E Bengio, J Pineau, D Precup

International Conference on Machine Learning, 767-777


Algorithmic improvements for deep reinforcement learning applied to interactive fiction

V Jain, W Fedus, H Larochelle, D Precup, MG Bellemare

Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 4328-4336


ABOVE PROBLEM I can't link a PDF, only download it


Value preserving state-action abstractions

D Abel, N Umbanhowar, K Khetarpal, D Arumugam, D Precup, M Littman

International Conference on Artificial Intelligence and Statistics, 1639-1650


Options of interest: Temporal abstraction with interest functions

K Khetarpal, M Klissarov, M Chevalier-Boisvert, PL Bacon, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 34 (04), 4444-4451


ABOVE PROBLEM can't link a PDF again


Gifting in multi-agent reinforcement learning

A Lupu, D Precup

Proceedings of the 19th International Conference on Autonomous Agents and MultiAgent Systems


Assessment of extubation readiness using spontaneous breathing trials in extremely preterm neonates

W Shalish, L Kanbar, L Kovacs, S Chawla, M Keszler, S Rao, Samantha Latremouille, Doina Precup, Karen Brown, Robert E Kearney, Guilherme M Sant’Anna

JAMA pediatrics 174 (2), 178-185


ABOVE PROBLEM can only download the PDF


An equivalence between loss functions and non-uniform sampling in experience replay

S Fujimoto, D Meger, D Precup

Advances in neural information processing systems 33, 14219-14230


Learning to cooperate: Emergent communication in multi-agent navigation

I Kajić, E Aygün, D Precup

arXiv preprint arXiv:2004.01097


On efficiency in hierarchical reinforcement learning

Z Wen, D Precup, M Ibrahimi, A Barreto, B Van Roy, S Singh

Advances in Neural Information Processing Systems 33, 6708-6718


Policy evaluation networks

J Harb, T Schaul, D Precup, PL Bacon

arXiv preprint arXiv:2002.11833


Forethought and hindsight in credit assignment

V Chelu, D Precup, HP van Hasselt

Advances in Neural Information Processing Systems 33, 2270-2281


Learning to prove from synthetic theorems

Eser Aygün, Zafarali Ahmed, Ankit Anand, Vlad Firoiu, Xavier Glorot, Laurent Orseau, Doina Precup, Shibl Mourad

arXiv preprint arXiv:2006.11259


Navigation agents for the visually impaired: A sidewalk simulator and experiments

Martin Weiss, Simon Chamorro, Roger Girgis, Margaux Luck, Samira E Kahou, Joseph P Cohen, Derek Nowrouzezahrai, Doina Precup, Florian Golemo, Chris Pal

Conference on Robot Learning, 1314-1327


Value-driven hindsight modelling

Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess

Advances in Neural Information Processing Systems 33, 12499-12509


Reward propagation using graph convolutional networks

M Klissarov, D Precup

Advances in Neural Information Processing Systems 33, 12895-12908


A distributional analysis of sampling-based reinforcement learning algorithms

P Amortila, D Precup, P Panangaden, MG Bellemare

International Conference on Artificial Intelligence and Statistics, 4357-4366


Locally persistent exploration in continuous control tasks with sparse rewards

S Amin, M Gomrokchi, H Aboutalebi, H Satija, D Precup

arXiv preprint arXiv:2012.13658


Diversity-Enriched Option-Critic

A Kamat, D Precup

arXiv preprint arXiv:2011.02565


Complete the missing half: Augmenting aggregation filtering with diversification for graph convolutional networks

S Luan, M Zhao, C Hua, XW Chang, D Precup

arXiv preprint arXiv:2008.08844


A brief look at generalization in visual meta-reinforcement learning

S Alver, D Precup

arXiv preprint arXiv:2006.07262


Phylogenetic Manifold Regularization: A semi-supervised approach to predict transcription factor binding sites

F Ahsan, A Drouin, F Laviolette, D Precup, M Blanchette

2020 IEEE International Conference on Bioinformatics and Biomedicine (BIBM)


ABOVE PROBLEM can't access PDF


A fully tensorized recurrent neural network

CC Onu, JE Miller, D Precup

arXiv preprint arXiv:2010.04196


Training matters: Unlocking potentials of deeper graph convolutional neural networks

S Luan, M Zhao, XW Chang, D Precup

arXiv preprint arXiv:2008.08838


Efficient planning under partial observability with unnormalized Q functions and spectral learning

T Li, B Mazoure, D Precup, G Rabusseau

International Conference on Artificial Intelligence and Statistics, 2852-2862


Reward redistribution mechanisms in multi-agent reinforcement learning

A Ibrahim, A Jitani, D Piracha, D Precup

Adaptive Learning Agents Workshop at the International Conference on Autonomous Agents and Multiagent Systems


Provably efficient reconstruction of policy networks

Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau

arXiv preprint arXiv:2002.02863


ABOVE PROBLEM seems like you can only download pdf from the site


A Study of Policy Gradient on a Class of Exactly Solvable Models

Gavin McCracken, Colin Daniels, Rosie Zhao, Anna Brandenberger, Prakash Panangaden, Doina Precup

arXiv preprint arXiv:2011.01859


Conditional Networks

Anthony Ortiz, Kris Sankaran, Olac Fuentes, Christopher Kiekintveld, Pascal Vincent, Yoshua Bengio, Doina Precup

ICLR 2021


Offline Policy Optimization with Variance Regularization

Riashat Islam, Samarth Sinha, Homanga Bharadhwaj, Samin Yeasar Arnob, Zhuoran Yang, Zhaoran Wang, Animesh Garg, Lihong Li, Doina Precup

ICLR 2021


Practical Marginalized Importance Sampling with the Successor Representation

S Fujimoto, D Meger, D Precup

ICLR 2021


Device free localization methods within smart indoor environments

N Ghourchian, MA MARTINEZ, D Precup

US Patent 10,779,127


ABOVE PROBLEM this is the same as a patent from 2022 and 2021?


Keynote Lecture: Building Knowledge For AI Agents With Reinforcement Learning

D Precup

2020 IEEE 16th International Conference on Intelligent Computer Communication and Processing (ICCP)


ABOVE PROBLEM can't get the PDF


Representation of Reinforcement Learning Policies in Reproducing Kernel Hilbert Spaces

Bogdan Mazoure, Thang Doan, Tianyu Li, Vladimir Makarenkov, Joelle Pineau, Doina Precup, Guillaume Rabusseau

arXiv preprint arXiv:2002.02863


Exploring Bayesian Deep Learning Uncertainty Measures for Segmentation of New Lesions in Longitudinal MRIs

NM Sepahvand, R Mehta, DL Arnold, D Precup, T Arbel

MIDL 2020

2019

Off-policy deep reinforcement learning without exploration

S Fujimoto, D Meger, D Precup

International conference on machine learning, 2052-2062


Break the ceiling: Stronger multi-scale deep graph convolutional networks

S Luan, M Zhao, XW Chang, D Precup

Advances in neural information processing systems 32


Combined reinforcement learning via abstract representations

V François-Lavet, Y Bengio, D Precup, J Pineau

Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 3582-3589


ABOVE PROBLEM can only download the PDF from the site


Multiple kernel learning-based transfer regression for electric load forecasting

D Wu, B Wang, D Precup, B Boulet

IEEE Transactions on Smart Grid 11 (2), 1183-1192


ABOVE PROBLEM can't get PDF from the site


Hindsight credit assignment

Anna Harutyunyan, Will Dabney, Thomas Mesnard, Mohammad Gheshlaghi Azar, Bilal Piot, Nicolas Heess, Hado P van Hasselt, Gregory Wayne, Satinder Singh, Doina Precup, Remi Munos

Advances in neural information processing systems 32


The termination critic

A Harutyunyan, W Dabney, D Borsa, N Heess, R Munos, D Precup

arXiv preprint arXiv:1902.09996


The option keyboard: Combining skills in reinforcement learning

André Barreto, Diana Borsa, Shaobo Hou, Gheorghe Comanici, Eser Aygün, Philippe Hamel, Daniel Toyama, Shibl Mourad, David Silver, Doina Precup

Advances in Neural Information Processing Systems 32


Prediction of disease progression in multiple sclerosis patients using deep learning analysis of MRI data

A Tousignant, P Lemaître, D Precup, DL Arnold, T Arbel

International conference on medical imaging with deep learning, 483-492


Connecting weighted automata and recurrent neural networks through spectral learning

G Rabusseau, T Li, D Precup

The 22nd International Conference on Artificial Intelligence and Statistics


The impact of time interval between extubation and reintubation on death or bronchopulmonary dysplasia in extremely preterm infants

Wissam Shalish, Lara Kanbar, Lajos Kovacs, Sanjay Chawla, Martin Keszler, Smita Rao, Bogdan Panaitescu, Alyse Laliberte, Doina Precup, Karen Brown, Robert E Kearney, Guilherme M Sant'Anna

The Journal of pediatrics 205, 70-76.e2


ABOVE PROBLEM can't get the PDF from the site


Neural transfer learning for cry-based diagnosis of perinatal asphyxia

CC Onu, J Lebensold, WL Hamilton, D Precup

arXiv preprint arXiv:1906.10199


Self-supervised learning of distance functions for goal-conditioned reinforcement learning

S Venkattaramanujam, E Crawford, T Doan, D Precup

arXiv preprint arXiv:1907.02998


Uncertainty aware learning from demonstrations in multiple contexts using Bayesian neural networks

S Thakur, H van Hoof, JCG Higuera, D Precup, D Meger

2019 International Conference on Robotics and Automation (ICRA), 768-774


Shaping representations through communication: community size effect in artificial learning systems

O Tieleman, A Lazaridou, S Mourad, C Blundell, D Precup

arXiv preprint arXiv:1912.06208


Option-critic in cooperative multi-agent systems

Jhelum Chakravorty, Nadeem Ward, Julien Roy, Maxime Chevalier-Boisvert, Sumana Basu, Andrei Lupu, Doina Precup

arXiv preprint arXiv:1911.12825


Early prediction of Alzheimer’s disease progression using variational autoencoders

S Basu, K Wagstyl, A Zandifar, L Collins, A Romero, D Precup

International Conference on Medical Image Computing and Computer-Assisted Intervention


ABOVE PROBLEM can't get the PDF


Improving pathological structure segmentation via transfer learning across diseases

Barleen Kaur, Paul Lemaître, Raghav Mehta, Nazanin Mohammadi Sepahvand, Doina Precup, Douglas Arnold, Tal Arbel

Domain Adaptation and Representation Transfer and Medical Image Learning with Less Labels and Imperfect Data


SVRG for policy evaluation with fewer gradient evaluations

Z Peng, A Touati, P Vincent, D Precup

arXiv preprint arXiv:1906.03704


Variational state encoding as intrinsic motivation in reinforcement learning

M Klissarov, R Islam, K Khetarpal, D Precup

Task-Agnostic Reinforcement Learning Workshop at Proceedings of the International Conference on Learning Representations


Marginalized state distribution entropy regularization in policy optimization

R Islam, Z Ahmed, D Precup

arXiv preprint arXiv:1912.05128


Singular value automata and approximate minimization

B Balle, P Panangaden, D Precup

Mathematical Structures in Computer Science 29 (9), 1444-1478


An empirical study of batch normalization and group normalization in conditional computation

V Michalski, V Voleti, SE Kahou, A Ortiz, P Vincent, C Pal, D Precup

arXiv preprint arXiv:1908.00061


Learning options with interest functions

K Khetarpal, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 9955-9956


ABOVE PROBLEM can't get the PDF, only download it


Per-decision option discounting

A Harutyunyan, P Vrancx, P Hamel, A Nowe, D Precup

International Conference on Machine Learning, 2644-2652


Learning representations of logical formulae using graph neural networks

X Glorot, A Anand, E Aygun, S Mourad, P Kohli, D Precup

Neural Information Processing Systems, Workshop on Graph Representation Learning


Entropy regularization with discounted future state distribution in policy gradient methods

R Islam, R Seraj, PL Bacon, D Precup

arXiv preprint arXiv:1912.05104


Augmenting learning using symmetry in a biologically-inspired domain

S Mishra, A Abdolmaleki, A Guez, P Trochim, D Precup

arXiv preprint arXiv:1910.00528


Leveraging observations in bandits: Between risks and benefits

A Lupu, A Durand, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 33 (01), 6112-6119


ABOVE PROBLEM can only download pdf


Recurrent value functions

P Thodoroff, N Anand, L Caccia, D Precup, J Pineau

arXiv preprint arXiv:1905.09562


Learning modular safe policies in the bandit setting with application to adaptive clinical trials

H Aboutalebi, D Precup, T Schuster

arXiv preprint arXiv:1903.01026


Temporally extended metrics for Markov decision processes

P Amortila, MG Bellemare, P Panangaden, D Precup

SafeAI@ AAAI


Avoidance Learning Using Observational Reinforcement Learning

David Venuto, Léonard Boussioux, Junhao Wang, Rola Dali, Jhelum Chakravorty, Yoshua Bengio, Doina Precup

arXiv preprint arXiv:1909.11228


Building knowledge for AI agents with reinforcement learning

D Precup

Proceedings of the 18th International Conference on Autonomous Agents and …


ABOVE PROBLEM this is the same as a paper from 2020


META-Learning State-based Eligibility Traces for More Sample-Efficient Policy Evaluation

M Zhao, S Luan, I Porada, XW Chang, D Precup

arXiv preprint arXiv:1904.11439


Learning proposals for sequential importance samplers using reinforced variational inference

Z Ahmed, A Karuvally, D Precup, S Gravel

ICLR 2019


Automatic Curriculum Generation via Task Perturbations in Reinforcement Learning

S Venkattaramanujam, R Islam, D Precup

TARL workshop, ICLR 2019


Graph convolutional networks as reward shaping functions

M Klissarov, D Precup

ICLR Workshop on Representation Learning on Graphs and Manifolds


Doubly Robust Off-Policy Actor-Critic Algorithms for Reinforcement Learning

R Islam, R Seraj, SY Arnob, D Precup

arXiv preprint arXiv:1912.05109


Assessing Generalization in TD methods for Deep Reinforcement Learning

E Bengio, D Precup, J Pineau

ICLR 2020


Revisit Policy Optimization in Matrix Form

S Luan, XW Chang, D Precup

arXiv preprint arXiv:1909.09186


Community size effect in artificial learning systems.

O Tieleman, A Lazaridou, S Mourad, C Blundell, D Precup

ViGIL@ NeurIPS


ABOVE PROBLEM this seems very similar to a paper from the same year


Learning Reliable Policies in the Bandit Setting with Application to Adaptive Clinical Trials.

H Aboutalebi, D Precup, T Schuster

KHD@ IJCAI, 43-49

2018

Deep reinforcement learning that matters

P Henderson, R Islam, P Bachman, J Pineau, D Precup, D Meger

Proceedings of the AAAI conference on artificial intelligence 32 (1)


ABOVE PROBLEM can only download the PDF from the site


When waiting is not an option: Learning options with a deliberation cost

J Harb, PL Bacon, M Klissarov, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 32 (1)


ABOVE PROBLEM can only download the PDF from the site


OptionGAN: Learning joint reward-policy options using generative adversarial inverse reinforcement learning

P Henderson, WD Chang, PL Bacon, D Meger, J Pineau, D Precup

Proceedings of the AAAI conference on artificial intelligence 32 (1)


ABOVE PROBLEM can only download the PDF from the site


Disentangling the independently controllable factors of variation by interacting with the world

Valentin Thomas, Emmanuel Bengio, William Fedus, Jules Pondard, Philippe Beaudoin, Hugo Larochelle, Joelle Pineau, Doina Precup, Yoshua Bengio

arXiv preprint arXiv:1802.09484


Learning robust options

D Mankowitz, T Mann, PL Bacon, D Precup, S Mannor

Proceedings of the AAAI Conference on Artificial Intelligence 32 (1)


ABOVE PROBLEM can only download the PDF from the site


Resolving event coreference with supervised representation learning and clustering-oriented regularization

K Kenyon-Dean, JCK Cheung, D Precup

arXiv preprint arXiv:1805.10985


Convergent Tree Backup and Retrace with function approximation

A Touati, PL Bacon, D Precup, P Vincent

International Conference on Machine Learning, 4955-4964


Patterns of reintubation in extremely preterm infants: a longitudinal cohort study

Wissam Shalish, Lara Kanbar, Martin Keszler, Sanjay Chawla, Lajos Kovacs, Smita Rao, Bogdan A Panaitescu, Alyse Laliberte, Doina Precup, Karen Brown, Robert E Kearney, Guilherme M Sant'Anna

Pediatric research 83 (5), 969-975


Learning safe policies with expert guidance

J Huang, F Wu, D Precup, Y Cai

Advances in Neural Information Processing Systems 31


Optimizing home energy management and electric vehicle charging with reinforcement learning

D Wu, G Rabusseau, V François-Lavet, D Precup, B Boulet

Proceedings of the 16th Adaptive Learning Agents


Temporal regularization for Markov decision process

P Thodoroff, A Durand, J Pineau, D Precup

Advances in Neural Information Processing Systems 31


Learning with options that terminate off-policy

A Harutyunyan, P Vrancx, PL Bacon, D Precup, A Nowe

Proceedings of the AAAI Conference on Artificial Intelligence 32 (1)


ABOVE PROBLEM can only download the PDF from the site


Clustering-oriented representation learning with attractive-repulsive loss

Kian Kenyon-Dean, Andre Cianflone, Lucas Page-Caccia, Guillaume Rabusseau, Jackie Chi Kit Cheung, Doina Precup

arXiv preprint arXiv:1812.07627


Eligibility traces for options

A Jain

McGill University (Canada)


ABOVE PROBLEM NOT IN AUTHORS LIST

Knowledge representation for reinforcement learning using general value functions

Gheorghe Comanici, Doina Precup, Andre Barreto, Daniel Kenji Toyama, Eser Aygün, Philippe Hamel, Sasha Vezhnevets, Shaobo Hou, Shibl Mourad

ICLR 2019


Shaping representations through communication

O Tieleman, A Lazaridou, S Mourad, C Blundell, D Precup

ICLR 2019


ABOVE PROBLEM seems similar to a 2019 paper


Undersampling and bagging of decision trees in the analysis of cardiorespiratory behavior for the prediction of extubation readiness in extremely preterm infants

Lara J Kanbar, Charles C Onu, Wissam Shalish, Karen A Brown, Guilherme M Sant'Anna, Doina Precup, Robert E Kearney

2018 40th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


Environments for lifelong reinforcement learning

K Khetarpal, S Sodhani, S Chandar, D Precup

arXiv preprint arXiv:1811.10732


Dyna planning using a feature based generative model

R Faulkner, D Precup

arXiv preprint arXiv:1805.10129


The Barbados 2018 list of open issues in continual learning

Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

arXiv preprint arXiv:1811.07004


Attend before you act: Leveraging human visual attention for continual learning

K Khetarpal, D Precup

arXiv preprint arXiv:1807.09664


Nonlinear weighted finite automata

T Li, G Rabusseau, D Precup

International Conference on Artificial Intelligence and Statistics, 679-688


Constructing temporal abstractions autonomously in reinforcement learning

PL Bacon, D Precup

AI Magazine 39 (1), 39-50


ABOVE PROBLEM can only download the PDF


Temporal abstraction

D Precup, C Paduraru, A Koop, RS Sutton, S Singh

URL: http://videolectures.net/site/normal_dl/tag 1199094


ABOVE PROBLEM can only download the PDF


A neural network based nonlinear weighted finite automata

T Li

McGill University (Canada)


ABOVE PROBLEM not on authors list and this paper is similar to another 2018 one


Diffusion-Based Approximate Value Functions

M Klissarov, D Precup

ICML 2018


Learning predictive state representations from non-uniform sampling

Y Grinberg, H Aboutalebi, M Lyman-Abramovitch, B Balle, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 32 (1)


ABOVE PROBLEM can only download the PDF


Prediction of Progression in Multiple Sclerosis Patients

A Tousignant, P Lemaître, D Precup, D Arnold, T Arbel

International Conference on Medical Imaging with Deep Learning--Full Paper Track


Where Off-Policy Deep Reinforcement Learning Fails

S Fujimoto, D Meger, D Precup

ICLR 2019


Leveraging Observational Learning for Exploration in Bandits

A Lupu, A Durand, D Precup

Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems


Imitation upper confidence bound for bandits on a graph

A Lupu, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 32 (1)


ABOVE PROBLEM can only download the PDF

2017

The option-critic architecture

PL Bacon, J Harb, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 31 (1)


ABOVE PROBLEM can only download the PDF


Reproducibility of benchmarked deep reinforcement learning tasks for continuous control

R Islam, P Henderson, M Gomrokchi, D Precup

arXiv preprint arXiv:1708.04133


Independently controllable features

Valentin Thomas, Jules Pondard, Emmanuel Bengio, Marc Sarfati, Philippe Beaudoin, Marie-Jean Meurs, Joelle Pineau, Doina Precup, Yoshua Bengio

arXiv preprint arXiv:1708.01289


Real-time indoor localization in smart homes using semi-supervised learning

N Ghourchian, M Allegue-Martinez, D Precup

Twenty-Ninth IAAI Conference


ABOVE PROBLEM can't get the PDF (the website doesn't work)


Learnings options end-to-end for continuous action tasks

M Klissarov, PL Bacon, J Harb, D Precup

arXiv preprint arXiv:1712.00004


Independently controllable features

E Bengio, V Thomas, J Pineau, D Precup, Y Bengio

arXiv preprint arXiv:1703.07718


ABOVE PROBLEM this is the same as another paper but it has fewer authors


Independently controllable factors

V Thomas, J Pondard, E Bengio, M Sarfati, P Beaudoin, MJ Meurs, ...

arXiv preprint arXiv:1708.01289


ABOVE PROBLEM is this the same as the other two independently controllable factors/features?


Prediction of extubation readiness in extremely preterm infants by the automated analysis of cardiorespiratory behavior: study protocol

Wissam Shalish, Lara J Kanbar, Smita Rao, Carlos A Robles-Rubio, Lajos Kovacs, Sanjay Chawla, Martin Keszler, Doina Precup, Karen Brown, Robert E Kearney, Guilherme M Sant’Anna

BMC pediatrics 17 (1), 1-15


ABOVE PROBLEM can only download the PDF; also this is very similar to another 2017 paper


World knowledge for reading comprehension: Rare entity prediction with hierarchical lstms using external descriptions

T Long, E Bengio, R Lowe, JCK Cheung, D Precup

Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing


ABOVE PROBLEM can only download the PDF


Boosting based multiple kernel learning and transfer regression for electricity load forecasting

D Wu, B Wang, D Precup, B Boulet

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Variational generative stochastic networks with collaborative shaping

P Bachman, D Precup

arXiv preprint arXiv:1708.00805


Investigating recurrence and eligibility traces in deep Q-networks

J Harb, D Precup

arXiv preprint arXiv:1704.05495


Ubenwa: Cry-based diagnosis of birth asphyxia

Charles C Onu, Innocent Udeogu, Eyenimi Ndiomu, Urbain Kengni, Doina Precup, Guilherme M Sant'Anna, Edward Alikor, Peace Opara

arXiv preprint arXiv:1711.06405


Predicting future disease activity and treatment responders for multiple sclerosis patients using a bag-of-lesions brain representation

A Doyle, D Precup, DL Arnold, T Arbel

International Conference on Medical Image Computing and Computer-Assisted Intervention


ABOVE PROBLEM can only download the PDF


Proceedings of the 34th International Conference on Machine Learning-Volume 70

D Precup, YW Teh

JMLR.org


ABOVE PROBLEM idk where to even find the pdf on the site


Predicting extubation readiness in extreme preterm infants based on patterns of breathing

Charles C Onu, Lara J Kanbar, Wissam Shalish, Karen A Brown, Guilherme M Sant'Anna, Robert E Kearney, Doina Precup

2017 IEEE Symposium Series on Computational Intelligence (SSCI), 1-7


ABOVE PROBLEM seems very similar to another 2017 paper


A semi-Markov chain approach to modeling respiratory patterns prior to extubation in preterm infants

Charles C Onu, Lara J Kanbar, Wissam Shalish, Karen A Brown, Guilherme M Sant'Anna, Robert E Kearney, Doina Precup

2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


Horizontal and vertical self-adaptive cloud controller with reward optimization for resource allocation

JAC Cabré, D Precup, R Sanz

2017 International Conference on Cloud and Autonomic Computing (ICCAC), 184-185


ABOVE PROBLEM can only download the PDF


Multi-Timescale, Gradient Descent, Temporal Difference Learning with Linear Options

P Kumar, D Precup

arXiv preprint arXiv:1703.06471


APEX_SCOPE: A graphical user interface for visualization of multi-modal data in inter-disciplinary studies

LJ Kanbar, W Shalish, D Precup, K Brown, GM Sant'Anna, RE Kearney

2017 39th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


ABOVE PROBLEM can't get the PDF


Learning-based interactive segmentation using the maximum mean cycle weight formalism

Sharmin Nilufar, DS Wang, John Girgis, CG Palii, D Yang, A Blais, M Brand, Doina Precup, Theodore J Perkins

Medical Imaging 2017: Image Processing 10133, 792-811


ABOVE PROBLEM can only download the PDF


Temporal Abstraction in Reinforcement Learning

PL Bacon, D Precup


2016

Practical kernel-based reinforcement learning

AMS Barreto, D Precup, J Pineau

The Journal of Machine Learning Research 17 (1), 2372-2441


Differentially private policy evaluation

B Balle, M Gomrokchi, D Precup

International Conference on Machine Learning, 2130-2138


Leveraging lexical resources for learning entity embeddings in multi-relational data

T Long, R Lowe, JCK Cheung, D Precup

arXiv preprint arXiv:1605.05416


Verb phrase ellipsis resolution using discriminative and margin-infused algorithms

K Kenyon-Dean, JCK Cheung, D Precup

Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing


Automated ongoing data validation and quality control of multi-institutional studies

LJ Kanbar, W Shalish, D Precup, K Brown, GM Sant'Anna, RE Kearney

2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


ABOVE PROBLEM can't get the PDF


Incremental stochastic factorization for online reinforcement learning

A Barreto, R Beirigo, J Pineau, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 30 (1)


ABOVE PROBLEM can only download the PDF


A matrix splitting perspective on planning with options

PL Bacon, D Precup

arXiv preprint arXiv:1612.00916


Learning Multi-Step Predictive State Representations.

L Langer, B Balle, D Precup

IJCAI, 1662-1668


Reinforcement learning of conditional computation policies for neural networks

E Bengio, PL Bacon, R Lowe, J Pineau, D Precup

ICML Workshop on Abstractions in RL


Prediction of Cell Type Specific Transcription Factor Binding Site Occupancy

F Ahsan, D Precup, M Blanchette

Proceedings of the 7th ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics


ABOVE PROBLEM can't get the PDF


Initializing Entity Representations in Relational Models

T Long, R Lowe, J Cheung, D Precup

ICLR 2016


Special Issue on Probabilistic Models for Biomedical Image Analysis

T Arbel, MJ Cardoso, W Wells III, ACS Chung, D Precup

Computer Vision and Image Understanding 151, 1


ABOVE PROBLEM can't get the PDF


2015

Conditional computation in neural networks for faster models

E Bengio, PL Bacon, J Pineau, D Precup

arXiv preprint arXiv:1511.06297


Data generation as sequential decision making

P Bachman, D Precup

Advances in Neural Information Processing Systems 28


Approximate value iteration with temporally extended actions

TA Mann, S Mannor, D Precup

Journal of Artificial Intelligence Research 53, 375-438


ABOVE PROBLEM can only download the PDF and maybe don't have access either


Method of identification and devices thereof

D Precup, J Frank, S Mannor

US Patent 8,935,195


ABOVE PROBLEM this is a patent not a paper?


Hierarchical temporal graphical model for head pose estimation and subsequent attribute classification in real-world videos

M Demirkus, D Precup, JJ Clark, T Arbel

Computer Vision and Image Understanding 136, 128-145


ABOVE PROBLEM can't get the PDF


A canonical form for weighted automata and applications to approximate minimization

B Balle, P Panangaden, D Precup

2015 30th Annual ACM/IEEE Symposium on Logic in Computer Science, 701-712


ABOVE PROBLEM can't get the PDF


IMaGe: iterative multilevel probabilistic graphical model for detection and segmentation of multiple sclerosis lesions in brain MRI

N Subbanna, D Precup, D Arnold, T Arbel

International Conference on Information Processing in Medical Imaging, 514-526


ABOVE PROBLEM can't get the PDF


Classification-based approximate policy iteration

A Farahmand, D Precup, AMS Barreto, M Ghavamzadeh

IEEE Transactions on Automatic Control 60 (11), 2989-2993


Hierarchical spatio-temporal probabilistic graphical model with multiple feature fusion for binary facial attribute classification in real-world face videos

M Demirkus, D Precup, JJ Clark, T Arbel

IEEE Transactions on Pattern Analysis and Machine Intelligence 38 (6), 1185-1203


ABOVE PROBLEM can't get the PDF


Training deep generative models: Variations on a theme

P Bachman, D Precup

NIPS Approximate Inference Workshop


Quantifying the determinants of outbreak detection performance through simulation and machine learning

N Jafarpour, M Izadi, D Precup, DL Buckeridge

Journal of Biomedical Informatics 53, 180-187


ABOVE PROBLEM can't get the PDF


Representation discovery for mdps using bisimulation metrics

SS Ruan, G Comanici, P Panangaden, D Precup

Twenty-Ninth AAAI Conference on Artificial Intelligence


ABOVE PROBLEM can only download the PDF


Learning with options: Just deliberate and relax

PL Bacon, D Precup

NIPS Bounded Optimality and Rational Metareasoning Workshop


Feature selection and oversampling in analysis of clinical data for extubation readiness in extreme preterm infants

P Gourdeau, L Kanbar, W Shalish, G Sant'Anna, R Kearney, D Precup

2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


Policy gradient methods for off-policy control

L Lehnert, D Precup

arXiv preprint arXiv:1512.04105


Organizational principles of cloud storage to support collaborative biomedical research

Lara J Kanbar, Wissam Shalish, Carlos A Robles-Rubio, Doina Precup, Karen Brown, Guilherme M Sant'Anna, Robert E Kearney

2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)


Basis refinement strategies for linear value function approximation in MDPs

G Comanici, D Precup, P Panangaden

Advances in Neural Information Processing Systems 28


Conditional computation in neural networks using a decision-theoretic approach

PL Bacon, E Bengio, J Pineau, D Precup

Proceedings of the 2nd Multidisciplinary Conference on Reinforcement Learning and Decision Making


ABOVE PROBLEM this is similar to another paper from 2015


Learning and Planning with Timing Information in Markov Decision Processes.

PL Bacon, B Balle, D Precup

UAI, 111-120


Testing visual attention in dynamic environments

P Bachman, D Krueger, D Precup

arXiv preprint arXiv:1510.08949


An expectation-maximization algorithm to compute a stochastic factorization from data

AMS Barreto, RL Beirigo, J Pineau, D Precup

Twenty-Fourth International Joint Conference on Artificial Intelligence


ABOVE PROBLEM can only download the PDF


Correlation of clinical parameters with cardiorespiratory behavior in successfully extubated extremely preterm infants

Lara J Kanbar, Wissam Shalish, Carlos A Robles-Rubio, Doina Precup, Karen Brown, Guilherme M Sant'Anna, Robert E Kearney

2015 37th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)

2014

The multimodal brain tumor image segmentation benchmark (BRATS)

Bjoern H Menze, Andras Jakab, Stefan Bauer, Jayashree Kalpathy-Cramer, Keyvan Farahani, Justin Kirby, Yuliya Burren, Nicole Porz, Johannes Slotboom, Roland Wiest, Levente Lanczi, Elizabeth Gerstner, Marc-Andre Weber, Tal Arbel, Brian B Avants, Nicholas Ayache, Patricia Buendia, D Louis Collins, Nicolas Cordier, Jason J Corso, Antonio Criminisi, Tilak Das, Herve Delingette, Çağatay Demiralp, Christopher R Durst, Michel Dojat, Senan Doyle, Joana Festa, Florence Forbes, Ezequiel Geremia, Ben Glocker, Polina Golland, Xiaotao Guo, Andac Hamamci, Khan M Iftekharuddin, Raj Jena, Nigel M John, Ender Konukoglu, Danial Lashkari, Jose Antonio Mariz, Raphael Meier, Sergio Pereira, Doina Precup, Stephen J Price, Tammy Riklin Raviv, Syed MS Reza, Michael Ryan, Duygu Sarikaya, Lawrence Schwartz, Hoo-Chang Shin, Jamie Shotton, Carlos A Silva, Nuno Sousa, Nagesh K Subbanna, Gabor Szekely, Thomas J Taylor, Owen M Thomas, Nicholas J Tustison, Gozde Unal, Flor Vasseur, Max Wintermark, Dong Hye Ye, Liang Zhao, Binsheng Zhao, Darko Zikic, Marcel Prastawa, Mauricio Reyes, Koen Van Leemput

IEEE transactions on medical imaging 34 (10), 1993-2024


ABOVE PROBLEM there are so many authors; is it worth including them all?


Algorithms for multi-armed bandit problems

V Kuleshov, D Precup

arXiv preprint arXiv:1402.6028


Learning with pseudo-ensembles

P Bachman, O Alsharif, D Precup

Advances in neural information processing systems 27


Iterative multilevel MRF leveraging context and voxel information for brain tumour segmentation in MRI

N Subbanna, D Precup, T Arbel

Proceedings of the IEEE conference on computer vision and pattern recognition


Probabilistic temporal head pose estimation using a hierarchical graphical model

M Demirkus, D Precup, JJ Clark, T Arbel

European conference on computer vision, 328-344


A new Q (lambda) with interim forward view and Monte Carlo equivalence

R Sutton, AR Mahmood, D Precup, H Hasselt

International Conference on Machine Learning, 568-576


Bisimulation Metrics are Optimal Value Functions.

N Ferns, D Precup

UAI, 210-219


Policy iteration based on stochastic factorization

AMS Barreto, J Pineau, D Precup

Journal of Artificial Intelligence Research 50, 763-803


ABOVE PROBLEM can only download the PDF


Multi-layer temporal graphical model for head pose estimation in real-world videos

M Demirkus, D Precup, JJ Clark, T Arbel

2014 IEEE International Conference on Image Processing (ICIP), 3392-3396


Bisimulation for Markov decision processes through families of functional expressions

N Ferns, D Precup, S Knight

Horizons of the Mind. A Tribute to Prakash Panangaden, 319-342


Pooled screening for synergistic interactions subject to blocking and noise

K Li, D Precup, TJ Perkins

PLoS ONE 9 (1), e85864


ABOVE PROBLEM can only download a PDF


Bayesian and GrAphical Models for Biomedical Imaging

I Simpson, T Arbel, D Precup, A Ribbens, MJ Cardoso

Springer


ABOVE PROBLEM there is no PDF


Optimizing energy production using policy search and predictive state representations

Y Grinberg, D Precup, M Gendreau

Advances in Neural Information Processing Systems 27


Sample-based approximate regularization

P Bachman, A Farahmand, D Precup

International Conference on Machine Learning, 1926-1934


Bayesian and GrAphical Models for Biomedical Imaging: First International Workshop, BAMBI 2014, Cambridge, MA, USA, September 18, 2014, Revised Selected Papers

MJ Cardoso, I Simpson, T Arbel, D Precup, A Ribbens

Springer


ABOVE PROBLEM can't get the PDF


Analyzing User Trajectories from Mobile Device Data with Hierarchical Dirichlet Processes

N Ghourchian, D Precup

Canadian Conference on Artificial Intelligence, 107-118


ABOVE PROBLEM can't get the PDF


Theoretical results on the effect of ‘shortcut’ actions in MDPs

SM McCarthy, D Precup

Connection Science 26 (2), 179-193


2013

Learning from limited demonstrations

B Kim, A Farahmand, J Pineau, D Precup

Advances in Neural Information Processing Systems 26


Hierarchical probabilistic Gabor and MRF segmentation of brain tumours in MRI volumes

NK Subbanna, D Precup, DL Collins, T Arbel

International conference on medical image computing and computer-assisted intervention


Smart exploration in reinforcement learning using absolute temporal difference errors

C Gehring, D Precup

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems


Assessing the predictability of hospital readmission using machine learning

A Hosseinzadeh, M Izadi, A Verma, D Precup, D Buckeridge

Twenty-fifth IAAI conference


ABOVE PROBLEM can only download the PDF


Bellman error based feature generation using random projections on sparse spaces

M Milani Fard, Y Grinberg, A Farahmand, J Pineau, D Precup

Advances in Neural Information Processing Systems 26


Using hierarchical mixture of experts model for fusion of outbreak detection methods

N Jafarpour, D Precup, M Izadi, D Buckeridge

AMIA Annual Symposium Proceedings 2013, 663


Generating storylines from sensor data

J Frank, S Mannor, D Precup

Pervasive and Mobile Computing 9 (6), 838-847


Using label propagation for learning temporally abstract actions in reinforcement learning

PL Bacon, D Precup

Proceedings of the Workshop on Multiagent Interaction Networks, 1-7


An empirical analysis of off-policy learning in discrete mdps

C Păduraru, D Precup, J Pineau, G Comănici

European Workshop on Reinforcement Learning, 89-102


Reinforcement learning competition: Helicopter hovering with controllability and kernel-based stochastic factorization

A Asbah, AMS Barreto, C Gehring, J Pineau, D Precup

Proc. Int. Conf. Mach. Learn.(ICML) Reinforcement Learn. Competition Workshop


Greedy confidence pursuit: A pragmatic approach to multi-bandit optimization

P Bachman, D Precup

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Average reward optimization objective in partially observable domains

Y Grinberg, D Precup

International Conference on Machine Learning, 320-328


Mining hospital admission-discharge data to discover the chance of readmission

A Hosseinzadeh


ABOVE PROBLEM can only download PDF and not on authors list


Approximate Policy Iteration with Demonstration Data

B Kim, A Farahmand, J Pineau, D Precup

RLDM 2013, 168


Determinants of Outbreak Detection Performance

N Jafarpour, D Precup, D Buckeridge

Online Journal of Public Health Informatics 5 (1)


ABOVE PROBLEM can only download PDF


Smart Classifier Selection for Activity Recognition on Wearable Devices.

N Ghourchian, D Precup

ICPRAM, 581-585


Predictability Analysis of Hospital Readmission Using Machine Learning Techniques

A Hosseinzadeh, M Izadi, A Verma, D Precup, D Buckeridge

School of Computer Science, McGill University


ABOVE PROBLEM this paper seems similar to two others from 2013


2012

An information-theoretic approach to curiosity-driven reinforcement learning

S Still, D Precup

Theory in Biosciences 131 (3), 139-148


Time series analysis using geometric template matching

J Frank, S Mannor, J Pineau, D Precup

IEEE transactions on pattern analysis and machine intelligence 35 (3), 740-754


ABOVE PROBLEM can't get the PDF


Methods for computing state similarity in Markov decision processes

N Ferns, PS Castro, D Precup, P Panangaden

arXiv preprint arXiv:1206.6836


Metrics for Markov decision processes with infinite state spaces

N Ferns, P Panangaden, D Precup

arXiv preprint arXiv:1207.1386


Compressed least-squares regression on sparse spaces

MM Fard, Y Grinberg, J Pineau, D Precup

Twenty-Sixth AAAI Conference on Artificial Intelligence


ABOVE PROBLEM can only download the PDF


Prediction of extubation readiness in extreme preterm infants based on measures of cardiorespiratory variability

Doina Precup, Carlos A Robles-Rubio, Karen A Brown, L Kanbar, Jennifer Kaczmarek, Sanjay Chawla, Guilherme M Sant'Anna, Robert E Kearney

2012 Annual international conference of the IEEE Engineering in Medicine and Biology Society


A machine learning approach to the detection of fetal hypoxia during labor and delivery

PA Warrick, EF Hamilton, RE Kearney, D Precup

AI Magazine 33 (2), 79-79


ABOVE PROBLEM can only download PDF


On-the-fly algorithms for bisimulation metrics

G Comanici, P Panangaden, D Precup

2012 Ninth International Conference on Quantitative Evaluation of Systems


Soft biometric trait classification from real-world face videos conditioned on head pose estimation

M Demirkus, D Precup, JJ Clark, T Arbel

2012 IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops


Value pursuit iteration

A Farahmand, D Precup

Advances in Neural Information Processing Systems 25


On-line reinforcement learning using incremental kernel-based stochastic factorization

A Barreto, D Precup, J Pineau

Advances in Neural Information Processing Systems 25


Generalized classification-based approximate policy iteration

A Farahmand, D Precup, M Ghavamzadeh

Tenth European Workshop on Reinforcement Learning (EWRL) 2


Improved estimation in time varying models

D Precup, P Bachman

arXiv preprint arXiv:1206.6385


A study of off-policy learning in computational sustainability

C Paduraru, D Precup, J Pineau, G Comanici

European Workshop on Reinforcement Learning (EWRL) 24, 89-102


Mining administrative data to predict falls in the elderly population

A Hosseinzadeh, M Izadi, D Precup, D Buckeridge

Canadian Conference on Artificial Intelligence, 305-311


On average reward policy evaluation in infinite-state partially observable systems

Y Grinberg, D Precup

Artificial Intelligence and Statistics, 449-457


Random projections preserve linearity in sparse spaces

MM Fard, Y Grinberg, J Pineau, D Precup

School of Computer Science, McGill University, Tech. Rep


Csaba Szepesvári University of Alberta

A Geramifard, A Lazaric, A Farahmand, AD Salles, AMH Barreto, ...


ABOVE PROBLEM idk what the above thing even is, doesn't seem to be a paper


Reports of the AAAI 2011 conference workshops

Noa Agmon, Vikas Agrawal, David W Aha, Yiannis Aloimonos, Donagh Buckley, Prashant Doshi, Christopher Geib, Floriana Grasso, Nancy Green, Benjamin Johnston, Burt Kaliski, Christopher Kiekintveld, Edith Law, Henry Lieberman, Ole J Mengshoel, Ted Metzler, Joseph Modayil, Douglas W Oard, Nilufer Onder, Barry O'Sullivan, Katerina Pastra, Doina Precup, Sowmya Ramachandran, Chris Reed, Sanem Sariel-Talay, Ted Selker, Lokendra Shastri, Stephen F Smith, Satinder Singh, Siddharth Srivastava, Gita Sukthankar, David C Uthus, Mary-Anne Williams

AI Magazine 33 (1), 57-70


ABOVE PROBLEM again there are a ton of authors and I'm also not sure this is a paper.

2011

Horde: A scalable real-time architecture for learning knowledge from unsupervised sensorimotor interaction

RS Sutton, J Modayil, M Delp, T Degris, PM Pilarski, A White, D Precup

The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume 2


Bisimulation metrics for continuous Markov decision processes

N Ferns, P Panangaden, D Precup

SIAM Journal on Computing 40 (6), 1662-1714


ABOVE PROBLEM similar to two other papers from 2011


Activity recognition with mobile phones

J Frank, S Mannor, D Precup

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


ABOVE PROBLEM title seems similar to a 2013 paper but they are probably unrelated (Activity recognition with wearable devices or similar)


Reinforcement learning using kernel-based stochastic factorization

A Barreto, D Precup, J Pineau

Advances in Neural Information Processing Systems 24


Automatic construction of temporally extended actions for mdps using bisimulation metrics

PS Castro, D Precup

European Workshop on Reinforcement Learning, 140-152


Basis function discovery using spectral clustering and bisimulation metrics

G Comanici, D Precup

International Workshop on Adaptive and Learning Agents, 85-99


ABOVE PROBLEM can only download PDF


The duality of state and observation in probabilistic transition systems

M Dinculescu, C Hundt, P Panangaden, J Pineau, D Precup

International Tbilisi Symposium on Logic, Language, and Computation, 206-230


A framework for computing bounds for the return of a policy

C Păduraru, D Precup, J Pineau

European Workshop on Reinforcement Learning, 201-212


Adapted MRF Segmentation of Multiple Sclerosis Lesions Using Local Contextual Information.

NK Subbanna, SJ Francis, D Precup, DL Collins, DL Arnold, T Arbel

MIUA, 351-356


Activity Recognition with Time-Delay Embeddings

J Frank, S Mannor, D Precup

AAAI Spring Symposium: Computational Physiology


Learning compact representations of time-varying processes

P Bachman, D Precup

Proceedings of the AAAI Conference on Artificial Intelligence 25 (1), 1748-1749


ABOVE PROBLEM can only download PDF

2010

Activity and gait recognition with time-delay embeddings

J Frank, S Mannor, D Precup

Twenty-Fourth AAAI Conference on Artificial Intelligence


ABOVE PROBLEM can't get the PDF


Classification of normal and hypoxic fetuses from systems modeling of intrapartum cardiotocography

PA Warrick, EF Hamilton, D Precup, RE Kearney

IEEE Transactions on Biomedical Engineering 57 (4), 771-779


ABOVE PROBLEM can't get the PDF


Using bisimulation for policy transfer in MDPs

PS Castro, D Precup

Twenty-Fourth AAAI Conference on Artificial Intelligence


ABOVE PROBLEM can't get the PDF


Optimal policy switching algorithms for reinforcement learning

G Comanici, D Precup

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: Volume 1


Smarter sampling in model-based Bayesian reinforcement learning

PS Castro, D Precup

Joint European Conference on Machine Learning and Knowledge Discovery in Databases


Automatically suggesting topics for augmenting text documents

R West, D Precup, J Pineau

Proceedings of the 19th ACM international conference on Information and knowledge management


Approximate predictive representations of partially observable systems

M Dinculescu, D Precup

ICML


A novel similarity measure for time series data with applications to gait and activity recognition

J Frank, S Mannor, D Precup

Proceedings of the 12th ACM international conference adjunct papers on Ubiquitous computing-Adjunct


An approach to inference in probabilistic relational models using block sampling

F Kaelin, D Precup

Asian Conference on Machine Learning (ACML), 325-340


A study of approximate inference in probabilistic relational models

F Kaelin, D Precup

Proceedings of 2nd Asian Conference on Machine Learning, 315-330


An algebraic approach to dynamic epistemic logic

P Panangaden, C Phillips, D Precup, M Sadrzadeh

23rd International Workshop on Description Logics DL2010, 443


Convergent Temporal-Difference Learning with Arbitrary Differentiable Function Approximator

HR Maei, C Szepesvári, S Bhatnagar, D Silver, D Precup, R Sutton


ABOVE PROBLEM can't seem to find how to view the PDF

2009

Fast gradient-descent methods for temporal-difference learning with linear function approximation

Richard S Sutton, Hamid Reza Maei, Doina Precup, Shalabh Bhatnagar, David Silver, Csaba Szepesvári, Eric Wiewiora

Proceedings of the 26th annual international conference on machine learning


Convergent temporal-difference learning with arbitrary smooth function approximation

H Maei, C Szepesvari, S Bhatnagar, D Precup, D Silver, RS Sutton

Advances in neural information processing systems 22


Wikispeedia: An online game for inferring semantic distances between concepts

R West, J Pineau, D Precup

Twenty-First International Joint Conference on Artificial Intelligence


ABOVE PROBLEM can't get the PDF


Completing Wikipedia's hyperlink structure through dimensionality reduction

R West, D Precup, J Pineau

Proceedings of the 18th ACM conference on Information and knowledge management


ABOVE PROBLEM can't get the PDF (page doesn't exist anymore)


Identification of the dynamic relationship between intrapartum uterine pressure and fetal heart rate for normal and hypoxic fetuses

PA Warrick, EF Hamilton, D Precup, RE Kearney

IEEE Transactions on Biomedical Engineering 56 (6), 1587-1597


Equivalence relations in fully and partially observable Markov decision processes

PS Castro, P Panangaden, D Precup

Twenty-First International Joint Conference on Artificial Intelligence


Notions of state equivalence under partial observability

PS Castro, P Panangaden, D Precup

Proceedings of the 21st International Joint Conference on Artificial Intelligence (IJCAI-09)


Learning the difference between partially observable dynamical systems

S Zhioua, D Precup, F Laviolette, J Desharnais

Joint European Conference on Machine Learning and Knowledge Discovery in Databases



2008

Bounding performance loss in approximate MDP homomorphisms

J Taylor, D Precup, P Panangaden

Advances in Neural Information Processing Systems 21


Reinforcement learning in the presence of rare events

J Frank, S Mannor, D Precup

Proceedings of the 25th international conference on Machine learning, 336-343


ABOVE PROBLEM can only download the PDF


Point-based planning for predictive state representations

MT Izadi, D Precup

Conference of the Canadian Society for Computational Studies of Intelligence


Anytime similarity measures for faster alignment

R Brooks, T Arbel, D Precup

Computer Vision and Image Understanding 110 (3), 378-389


ABOVE PROBLEM can't get the PDF


Model-based reinforcement learning with state aggregation

C Paduraru, R Kaplow, D Precup, J Pineau

8th European Workshop on Reinforcement Learning


Detecting the temporal extent of the impulse response function from intra-partum cardiotocography for normal and hypoxic fetuses

PA Warrick, EF Hamilton, D Precup, RE Kearney

2008 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society


ABOVE PROBLEM can't get the PDF


Classification of normal and hypoxic fetuses using system identification from intra-partum cardiotocography

PA Warrick, EF Hamilton, RE Kearney, D Precup

ICML 2008 Workshop Mach. Learning Health Care Appl. Helsinki, Finland


ABOVE PROBLEM this paper seems very similar to a 2010 one

2007

Partial model completion in model driven engineering using constraint logic programming

S Sen, B Baudry, D Precup

17th International Conference on Applications of Declarative Programming and Knowledge Management (INAP 2007) and 21st Workshop on (Constraint)


Using Linear Programming for Bayesian Exploration in Markov Decision Processes.

PS Castro, D Precup

IJCAI, 2437-2442


A formal framework for robot learning and control under model uncertainty

R Jaulmes, J Pineau, D Precup

Proceedings 2007 IEEE International Conference on Robotics and Automation


Fast Image Alignment Using Anytime Algorithms.

R Brooks, T Arbel, D Precup

IJCAI, 2078-2083


Low-order parametric system identification for intrapartum uterine pressure-fetal heart rate interaction

PA Warrick, RE Kearney, D Precup, EF Hamilton

2007 29th Annual International Conference of the IEEE Engineering in Medicine and Biology Society


ABOVE PROBLEM can't get the PDF


An adaptive approach to optimize software component quality predictive models: Case of stability

D Azar, D Precup

New Technologies, Mobility and Security, 297-310


ABOVE PROBLEM can't get the PDF


Time progression of a parametric impulse response function estimate from intra-partum cardiotocography for normal and hypoxic fetuses

PA Warrick, RE Kearney, D Precup, EF Hamilton

2007 Computers in Cardiology, 693-696


Context-Driven Predictions

MG Bellemare, D Precup

IJCAI, 250-255


Fetal heart rate deceleration detection using a discrete cosine transform implementation of singular spectrum analysis

PA Warrick, D Precup, EF Hamilton, RE Kearney

Methods of Information in Medicine 46 (02), 196-201


ABOVE PROBLEM can only download the PDF


Representing Systems with Hidden State

DK Haghighi, C Hundt, P Panangaden, J Pineau, D Precup

AAAI Fall 2007 symposium


ABOVE PROBLEM there is no venue for it but there is a PDF, venue maybe AAAI symposium 2007?


HEART RATE VARIABILITY-Fetal Heart Rate Deceleration Detection Using a Discrete Cosine Transform Implementation of Singular Spectrum Analysis

PA Warrick, D Precup, EF Hamilton, RE Kearney

Methods of Information in Medicine 46 (2), 196


ABOVE PROBLEM this seems very similar to another 2007 paper and I can only download the PDF


Apprentissage actif dans les processus décisionnels de Markov partiellement observables: L'algorithme MEDUSA

R Jaulmes, J Pineau, D Precup

Revue d'intelligence artificielle 21 (1), 9-33


ABOVE PROBLEM I have no idea how to get the PDF

2006

Automatic basis function construction for approximate dynamic programming and reinforcement learning

PW Keller, S Mannor, D Precup

Proceedings of the 23rd international conference on Machine learning, 449-456


Data mining using relational database management systems

B Zou, X Ma, B Kemme, G Newton, D Precup

Pacific-asia conference on knowledge discovery and data mining, 657-667


PAC-learning of Markov models with hidden state

R Gavaldà, PW Keller, J Pineau, D Precup

European Conference on Machine Learning, 150-161


Representing Systems with Hidden State.

C Hundt, P Panangaden, J Pineau, D Precup

AAAI, 368-374


ABOVE PROBLEM this seems similar to a paper from 2007


Belief selection in point-based planning algorithms for POMDPs

MT Izadi, D Precup, D Azar

Conference of the Canadian Society for Computational Studies of Intelligence


System-Identification noise suppression for intra-partum cardiotocography to discriminate normal and hypoxic fetuses

PA Warrick, RE Kearney, D Precup, EF Hamilton

2006 Computers in Cardiology, 937-940


Exploration in POMDP belief space and its impact on value iteration approximation

MT Izadi, D Precup

European conference on artificial intelligence (ECAI 06), workshop on planning, learning and monitoring with uncertainty and dynamic worlds (PLMUDW)


Linear models of intrapartum uterine pressure-fetal heart rate interaction for the normal and hypoxic fetus

PA Warrick, RE Kearney, D Precup, EF Hamilton

2006 International Conference of the IEEE Engineering in Medicine and Biology Society


ABOVE PROBLEM this seems very similar to a 2007 paper and I can't get the PDF


Fetal Heart Rate Deceleration Detection from the Discrete Cosine Transform Spectrum

PA Warrick, D Precup, EF Hamilton, RE Kearney

2005 IEEE Engineering in Medicine and Biology 27th Annual Conference, 5555-5558


ABOVE PROBLEM this paper seems very similar to one from 2007


Automatic basis function construction for approximate dynamic programming and reinforcement learning

S Mannor, D Precup

In Cohen and Moore (2006


ABOVE PROBLEM I can't get the PDF (I don't know what this is) and it is similar to another 2006 paper


RedAgent: An Autonomous, Market-based Supply-Chain Management Agent for the Trading Agents Competition

D Precup, PW Keller, FO Duguay

Multiagent based Supply Chain Management, 135-154


ABOVE PROBLEM can't get the PDF


Duality of State and Observations

C Hundt, P Panangaden, J Pineau, D Precup, M Dinculescu

Research supported by NSERC and CFI


2005

Active learning in partially observable markov decision processes

R Jaulmes, J Pineau, D Precup

European Conference on Machine Learning, 601-608


Learning in non-stationary partially observable Markov decision processes

R Jaulmes, J Pineau, D Precup

ECML Workshop on Reinforcement Learning in non-stationary environments 25, 26-32


Off-policy learning with options and recognizers

D Precup, C Paduraru, A Koop, RS Sutton, S Singh

Advances in Neural Information Processing Systems 18


Using rewards for belief state updates in partially observable markov decision processes

MT Izadi, D Precup

European Conference on Machine Learning, 593-600


Risk-directed exploration in reinforcement learning

ELM Law


ABOVE PROBLEM not listed as an author of the paper


An approximation algorithm for labelled Markov processes: towards realistic approximation

A Bouchard-Côté, N Ferns, P Panangaden, D Precup

Second International Conference on the Quantitative Evaluation of Systems (QEST'05)


Using core beliefs for point-based value iteration.

MT Izadi, VR Ajit, D Precup

IJCAI, 1751-1753


Model minimization by linear PSR.

MT Izadi, D Precup

IJCAI, 1749-1750


Probabilistic robot planning under model uncertainty: an active learning approach

R Jaulmes, J Pineau, D Precup

NIPS Workshop on Machine Learning Based Robotics in Unstructured Environments


Active learning in POMDPs

R Jaulmes, J Pineau, D Precup

Proceedings of ECML-05


ABOVE PROBLEM can't get the PDF


The workshop program at the nineteenth national conference on artificial intelligence

Ion Muslea, Virginia Dignum, Daniel Corkill, Catholijn Jonker, Frank Dignum, Silvia Coradeschi, Alessandro Saffiotti, Dan Fu, Jeff Orkin, William E Cheetham, Kai Goebel, Piero Bonissone, Leen-Kiat Soh, Randolph M Jones, Robert E Wray, Matthias Scheutz, Daniela Pucci de Farias, Shie Mannor, Georgios Theocharou, Doina Precup, Bamshad Mobasher, Sarabjot Singh Anand, Bettina Berendt, Andreas Hotho, Hans Guesgen, Michael T Rosenstein, Mohammad Ghavamzadeh

AI Magazine 26 (1), 103-103


ABOVE PROBLEM there are a ton of authors and can only download the PDF

2004

Metrics for Finite Markov Decision Processes.

N Ferns, P Panangaden, D Precup

UAI 4, 162-169


Sparse distributed memories for on-line value-based reinforcement learning

B Ratitch, D Precup

European Conference on Machine Learning, 347-358


Redagent-2003: An autonomous market-based supply-chain management agent

PW Keller, FO Duguay, D Precup

Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems-Volume 3


Redagent: winner of TAC SCM 2003

PW Keller, FO Duguay, D Precup

ACM SIGecom Exchanges 4 (3), 1-8


ABOVE PROBLEM this paper seems similar to another 2004 one (the one right above)


Improving rule set based software quality prediction: A genetic algorithm-based approach

S Bouktif, D Azar, D Precup, H Sahraoui, B Kegl

Journal of Object Technology 3 (4), 227-241


Sparse distributed memories in reinforcement learning: Case studies

B Ratitch, S Mahadevan, D Precup

Proc. of the Workshop on Learning and Planning in Markov Processes-Advances and Challenges


ABOVE PROBLEM very similar to another paper from the same year


Classification using Φ-machines and constructive function approximation

D Precup, PE Utgoff

Machine Learning 55 (1), 31-52


Reasoning and Learning Lab: Technical Report RL-3.04 Funded by a NSERC Undergraduate Student Research Award

MG Bellemare, D Precup, F Rivest


2003

Combining TD-learning with cascade-correlation networks

F Rivest, D Precup

Proceedings of the 20th International Conference on Machine Learning (ICML-03)


A planning algorithm for predictive state representations

MT Izadi, D Precup

IJCAI, 1520-1521


Using MDP characteristics to guide exploration in reinforcement learning

B Ratitch, D Precup

European Conference on Machine Learning, 313-324


ABOVE PROBLEM very similar to another paper in the same year


Improving Rule Set Based Software Quality Prediction

D Azar, S Bouktif, H Sahraoui, B Kegl


ABOVE PROBLEM seems similar to a 2004 paper


Exploration in RL using MDP characteristics

B Ratitch, D Precup

EWRL-6'2003: European workshop on reinforcement learning (Nancy, 4-5 …


ABOVE PROBLEM similar to another paper from the same year and also can't get the PDF

2002

Learning options in reinforcement learning

M Stolle, D Precup

International Symposium on abstraction, reformulation, and approximation


ABOVE PROBLEM can't get the PDF

A convergent form of approximate policy iteration

T Perkins, D Precup

Advances in neural information processing systems 15


Using finite experiments to study asymptotic performance

C McGeoch, P Sanders, R Fleischer, PR Cohen, D Precup

Experimental algorithmics, 93-126


Combining and adapting software quality predictive models by genetic algorithms

D Azar, D Precup, S Bouktif, B Kégl, H Sahraoui

Proceedings 17th IEEE International Conference on Automated Software Engineering


ABOVE PROBLEM can't get the PDF


Characterizing Markov decision processes

B Ratitch, D Precup

European Conference on Machine Learning, 391-404


Developing collaborative Golog agents by reinforcement learning

IA Letia, D Precup

International Journal on Artificial Intelligence Tools 11 (3), 473


ABOVE PROBLEM this paper is very similar to one from 2001


2001

Off-policy temporal-difference learning with function approximation

D Precup, RS Sutton, S Dasgupta

ICML, 417-424


Developing collaborative Golog agents by reinforcement learning

IA Letia, D Precup

Proceedings 13th IEEE International Conference on Tools with Artificial Intelligence. ICTAI 2001


ABOVE PROBLEM can't get the PDF


Searching for Big-Oh in the data: Inferring asymptotic complexity from experiments

C McGeoch, P Sanders, R Fleischer, PR Cohen, D Precup

Lecture notes in computer science: Proceedings of the dagstuhl seminar on experimental algorithmics


Experimental Algorithmics: From Algorithm Design to Robust and Efficient Software

Catherine McGeoch, Peter Sanders, Rudolf Fleischer, Paul R Cohen, Doina Precup, Benita M Beamon, Victoria CP Chen, D Chakrabarti, C Faloutsos, Daniel Frey, Jens Nimis, Heinz Worn, Peter Lockemann, Victor Munoz, Javier Murillo, Didac Busquets, Beatriz Lopez, Rajdeep K Dash, Nicholas R Jennings, David C Parkes, SP Singh, MK Tiwara, A Larsen, O Madsen, M Solomon, Ashok U Mallya, Munindar P Singh, Birgit Heydenreich, Rudolf Muller, Marc Uetz, Yasmina Abdeddaim, Oded Maler

International Journal of Advanced Manufacturing Technology 2102, 478-492


ABOVE PROBLEM there are a ton of authors and can't get the PDF

2000

Eligibility traces for off-policy policy evaluation

D Precup

Computer Science Department Faculty Publication Series, 80


Temporal abstraction in reinforcement learning

D Precup

University of Massachusetts Amherst


ABOVE PROBLEM can't get the PDF

1999

Between MDPs and semi-MDPs: A framework for temporal abstraction in reinforcement learning

RS Sutton, D Precup, S Singh

Artificial intelligence 112 (1-2), 181-211


Using options for knowledge transfer in reinforcement learning

TJ Perkins, D Precup

Technical Report UM-CS-1999-034, The University of Massachusetts at Amherst

1998

Intra-Option Learning about Temporally Abstract Actions.

RS Sutton, D Precup, S Singh

ICML 98, 556-564


Theoretical results on reinforcement learning with temporally abstract options

D Precup, RS Sutton, S Singh

European conference on machine learning, 382-393


Improved switching among temporally abstract actions

RS Sutton, S Singh, D Precup, B Ravindran

Advances in neural information processing systems 11


Constructive function approximation

PE Utgoff, D Precup

Feature Extraction, Construction and Selection, 219-235


Hierarchical optimal control of MDPs

A McGovern, D Precup, B Ravindran, S Singh, RS Sutton

Proceedings of the Tenth Yale Workshop on Adaptive and Learning Systems, 186-191


14 CONSTRUCTIVE FUNCTION

PE Utgoff, D Precup

Feature Extraction, Construction and Selection: A Data Mining Perspective


ABOVE PROBLEM can't get the PDF

1997

Multi-time models for temporally abstract planning

D Precup, RS Sutton

Advances in neural information processing systems 10


Learning to schedule straight-line code

J Moss, Paul Utgoff, John Cavazos, Doina Precup, Darko Stefanovic, Carla Brodley, David Scheeff

Advances in Neural Information Processing Systems 10


Planning with closed-loop macro actions

D Precup, RS Sutton, SP Singh

Working notes of the 1997 AAAI Fall Symposium on Model-directed Autonomous Systems


Exponentiated gradient methods for reinforcement learning

D Precup, RS Sutton

ICML, 272-277


Multi-time models for reinforcement learning

D Precup, RS Sutton

Proceedings of the ICML’97 Workshop on Modelling in Reinforcement Learning


Classification using Φ-machines and constructive function approximation

D Precup, PE Utgoff

Proceedings of the 15th International Conference on Machine Learning, 439-444


How to find big-oh in your data set (and how not to)

CC McGeoch, PR Cohen

Sixth International Workshop on Artificial Intelligence and Statistics, 347-354


Relative value function approximation

PE Utgoff, D Precup

Tech. rep., Dept of Computer Science, Univ of Mass

1996

Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in Supervised and Reinforcement Learning Technical Report 96-70

D Precup, R Sutton


ABOVE PROBLEM can only download the PDF